Construction of a Rated Speech Corpus of L2 Learners' Spontaneous Speech

Authors

  • Su-Youn Yoon
  • Lisa Pierce
  • Amanda Huensch
  • Eric Juul
  • Samantha Perkins
  • Richard Sproat
  • Mark Hasegawa-Johnson

DOI:

https://doi.org/10.1558/cj.v26i3.662-673

Keywords:

Rated Speech Corpus, L2, Automated Scoring

Abstract

This work reports on the construction of a rated database of spontaneous speech produced by second language (L2) learners of English. Spontaneous speech was collected from 28 L2 speakers representing six language backgrounds and five different proficiency levels. Speech was elicited using formats similar to those of the TOEFL iBT and the Speaking Proficiency English Assessment Kit (SPEAK) test. A total of 182 minutes of spontaneous speech was collected, segmented, and assessed by two phonetically trained, experienced ESL instructors. The raters assigned a general fluency score and a phone accuracy score, with additional detailed comments on pronunciation errors. This database was designed with several applications in mind: the development of computer-aided pronunciation and fluency training, the automatic assessment of fluency and pronunciation, and use as a tool for researchers working in automatic speech recognition and for linguists more generally. This database will be released to the public in the near future.

Published

2013-01-14

Section

Articles

How to Cite

Yoon, S.-Y., Pierce, L., Huensch, A., Juul, E., Perkins, S., Sproat, R., & Hasegawa-Johnson, M. (2013). Construction of a Rated Speech Corpus of L2 Learners’ Spontaneous Speech. CALICO Journal, 26(3), 662-673. https://doi.org/10.1558/cj.v26i3.662-673
