The effect of using normalized models in statistical speech synthesis

Shannon, Matt; Zen, Heiga; Byrne, William Joseph

dc.creator	Shannon, Matt
dc.creator	Zen, Heiga
dc.creator	Byrne, William Joseph
dc.date.accessioned	2018-11-24T13:11:52Z
dc.date.available	2013-04-09T19:18:33Z
dc.date.available	2018-11-24T13:11:52Z
dc.date.issued	2011-08-27
dc.identifier	http://www.dspace.cam.ac.uk/handle/1810/244406
dc.identifier.uri	http://repository.aust.edu.ng/xmlui/handle/123456789/3039
dc.description.abstract	The standard approach to HMM-based speech synthesis is inconsistent in the enforcement of the deterministic constraints between static and dynamic features. The trajectory HMM and autoregressive HMM have been proposed as normalized models which rectify this inconsistency. This paper investigates the practical effects of using these normalized models, and examines the strengths and weaknesses of the different models as probabilistic models of speech. The most striking difference observed is that the standard approach greatly underestimates predictive variance. We argue that the normalized models have better predictive distributions than the standard approach, but that all the models we consider are still far from satisfactory probabilistic models of speech. We also present evidence that better intra-frame correlation modelling goes some way towards improving existing normalized models.
dc.language	en
dc.publisher	ISCA (International Speech Communication Association)
dc.rights	http://creativecommons.org/licenses/by/2.0/uk/
dc.rights	Attribution 2.0 UK: England & Wales
dc.subject	HMM-based speech synthesis
dc.subject	acoustic modelling
dc.subject	autoregressive HMM
dc.subject	trajectory HMM
dc.subject	normalization
dc.title	The effect of using normalized models in statistical speech synthesis
dc.type	Conference Object

Files in this item

Files	Size	Format
shannon2011effect.pdf	820.6Kb	application/pdf

This item appears in the following Collection(s)

School of Technology (369)
