Automatic Transcription of Multi-genre Media Archives

Bell, PJ; Hain, T; Seigel, MS; Renals, S; Woodland, Philip Charles; Lanchantin, Pierre Kim; Saz, O; Gales, Mark John; Swietojanski, S; Long, Y; Quinnell, J; Liu, X


dc.creator	Lanchantin, Pierre Kim
dc.creator	Bell, PJ
dc.creator	Gales, Mark John
dc.creator	Hain, T
dc.creator	Liu, X
dc.creator	Long, Y
dc.creator	Quinnell, J
dc.creator	Renals, S
dc.creator	Saz, O
dc.creator	Seigel, MS
dc.creator	Swietojanski, S
dc.creator	Woodland, Philip Charles
dc.date.accessioned	2018-11-24T13:11:58Z
dc.date.available	2013-07-22T13:56:29Z
dc.date.available	2018-11-24T13:11:58Z
dc.date.issued	2013
dc.identifier	http://www.dspace.cam.ac.uk/handle/1810/244726
dc.identifier.uri	http://repository.aust.edu.ng/xmlui/handle/123456789/3059
dc.description.abstract	This paper describes some recent results of our collaborative work on developing a speech recognition system for the automatic transcription of media archives from the British Broadcasting Corporation (BBC). The material includes a wide diversity of shows with their associated metadata. The latter are highly diverse in terms of completeness, reliability and accuracy. First, we investigate how to improve lightly supervised acoustic training when timestamp information is inaccurate and when speech deviates significantly from the transcription, and how to perform evaluations when no reference transcripts are available. We present an automatic timestamp correction method, as well as word- and segment-level combination approaches between the lightly supervised transcripts and the original programme scripts, which yield improved metadata. Experimental results show that systems trained using the improved metadata consistently outperform those trained with only the original lightly supervised decoding hypotheses. Secondly, we show that the recognition task may benefit from systems trained on a combination of in-domain and out-of-domain data. Working with tandem HMMs, we describe Multi-level Adaptive Networks, a novel technique for incorporating information from out-of-domain posterior features using deep neural networks. We show that it provides a substantial reduction in WER over other systems, including a PLP-based baseline, in-domain tandem features, and the best out-of-domain tandem features.
dc.language	en
dc.rights	http://creativecommons.org/licenses/by-nc/2.0/uk/
dc.rights	Attribution-NonCommercial 2.0 UK: England & Wales
dc.title	Automatic Transcription of Multi-genre Media Archives
dc.type	Conference Object

Files in this item

Files	Size	Format	View
Lanchantin 2013.pdf	115.9Kb	application/pdf	View/Open

This item appears in the following Collection(s)

School of Technology (369)
