English speech database read by Japanese learners for CALL system development

N. Minematsu, Y. Tomiyama, K. Yoshimoto, K. Shimizu, S. Nakagawa, M. Dantsuji, S. Makino

Research output: Contribution to conference > Paper > peer-review

25 Citations (Scopus)


With the help of recent advances in speech processing techniques, various kinds of practical speech applications can now be seen in both laboratories and the real world. One of the major applications in Japan is CALL (Computer Assisted Language Learning) systems. It is well known that most recent speech technologies are based on statistical methods, which require a large amount of speech data. Although many speech corpora are available from distribution sites such as the Linguistic Data Consortium and the European Language Resources Association, the number of speech corpora built specifically for CALL system development is very small. In this paper, we first introduce a Japanese national project, "Advanced Utilization of Multimedia to Promote Higher Educational Reform," under which several research groups are currently developing CALL systems. One of the main objectives of the project is to construct an English speech database read by Japanese students for CALL system development. This paper describes the specification of the database and the strategies adopted to select speakers and record their sentence and word utterances, in addition to the preliminary discussions and investigations carried out before the database was developed. Further, using the new database and the WSJ database, a corpus-based analysis and comparison of Japanese English and American English is carried out with respect to the entire phonemic system of English. Here, tree diagrams of the two kinds of English are drawn from their HMM sets. The results show many interesting characteristics of Japanese English.

Original language: English
Number of pages: 8
Publication status: Published - 2002
Externally published: Yes
Event: 3rd International Conference on Language Resources and Evaluation, LREC 2002 - Las Palmas, Canary Islands, Spain
Duration: 2002 May 29 to 2002 May 31


Other: 3rd International Conference on Language Resources and Evaluation, LREC 2002
City: Las Palmas, Canary Islands

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Education
  • Library and Information Sciences

