TY - GEN
T1 - Performance evaluation of the speaker-independent HMM-based speech synthesis system "HTS-2007" for the Blizzard Challenge 2007
AU - Yamagishi, Junichi
AU - Nose, Takashi
AU - Zen, Heiga
AU - Toda, Tomoki
AU - Tokuda, Keiichi
PY - 2008
Y1 - 2008
AB - This paper describes a speaker-independent/adaptive HMM-based speech synthesis system developed for the Blizzard Challenge 2007. The new system, named "HTS-2007", employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than that of speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available.
KW - Blizzard Challenge
KW - HMM
KW - HTS
KW - Speaker adaptation
KW - Speech synthesis
UR - http://www.scopus.com/inward/record.url?scp=51449103919&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51449103919&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2008.4518520
DO - 10.1109/ICASSP.2008.4518520
M3 - Conference contribution
AN - SCOPUS:51449103919
SN - 1424414849
SN - 9781424414840
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 3957
EP - 3960
BT - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
T2 - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Y2 - 31 March 2008 through 4 April 2008
ER -