Unsupervised intralingual and cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction

Byrne, William Joseph; Gibson, Matthew

dc.creator	Gibson, Matthew
dc.creator	Byrne, William Joseph
dc.date.accessioned	2018-11-24T13:10:50Z
dc.date.available	2010-08-25T16:02:46Z
dc.date.available	2018-11-24T13:10:50Z
dc.date.issued	2010
dc.identifier	http://www.dspace.cam.ac.uk/handle/1810/226328
dc.identifier.uri	http://repository.aust.edu.ng/xmlui/handle/123456789/2833
dc.description.abstract	Hidden Markov model (HMM)-based speech synthesis systems possess several advantages over concatenative synthesis systems. One such advantage is the relative ease with which HMM-based systems are adapted to speakers not present in the training dataset. Speaker adaptation methods used in the field of HMM-based automatic speech recognition (ASR) are adopted for this task. In the case of unsupervised speaker adaptation, previous work has used a supplementary set of acoustic models to estimate the transcription of the adaptation data. This paper firstly presents an approach to the unsupervised speaker adaptation task for HMM-based speech synthesis models which avoids the need for such supplementary acoustic models. This is achieved by defining a mapping between HMM-based synthesis models and ASR-style models, via a two-pass decision tree construction process. Secondly, it is shown that this mapping also enables unsupervised adaptation of HMM-based speech synthesis models without the need to perform linguistic analysis of the estimated transcription of the adaptation data. Thirdly, this paper demonstrates how this technique lends itself to the task of unsupervised cross-lingual adaptation of HMM-based speech synthesis models, and explains the advantages of such an approach. Finally, listener evaluations reveal that the proposed unsupervised adaptation methods deliver performance approaching that of supervised adaptation.
dc.publisher	IEEE Transactions on Audio, Speech and Language Processing
dc.subject	HMM-based speech synthesis
dc.subject	unsupervised speaker adaptation
dc.subject	cross-lingual
dc.title	Unsupervised intralingual and cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction
dc.type	Article

Files in this item

Files	Size	Format	View
CrossLingualAdapt.final.twocol.pdf	398.0Kb	application/pdf	View/Open

This item appears in the following Collection(s)

School of Technology (369)
