Mastering Overdetection and Underdetection in Learner-Answer Processing

Simple Techniques for Analysis and Diagnosis

Authors

  • Alexia Blanchard
  • Olivier Kraif
  • Claude Ponton

DOI:

https://doi.org/10.1558/cj.v26i3.592-610

Keywords:

CALL, Language Learning, Error Diagnosis, Error Feedback

Abstract

This paper presents a "didactic triangulation" strategy to address the reliability problem of NLP applications in computer-assisted language learning (CALL) systems. The strategy relies on basic but well-mastered NLP techniques and emphasizes a close fit between computable linguistic clues and the didactic features of the activities being evaluated. We claim that a correct balance between false positives (i.e., false error detection, or overdetection) and false negatives (i.e., undetected errors, or underdetection) is not only an outcome of NLP techniques but also of an appropriate didactic integration of what NLP can do well and what it cannot. Based on this approach, ExoGen is a prototype for generating activities such as gap-fill exercises. It integrates an error detection and description module that checks learners' answers against the expected ones. By analyzing graphic, orthographic, and morphosyntactic differences, it can diagnose problems such as spelling errors, lexical mix-ups, agreement errors, and conjugation errors. A first evaluation of ExoGen's output, based on the FRIDA learner corpus, has yielded very promising results, paving the way for an efficient, general model adaptable to a wide variety of activities.
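
To make the idea of checking a learner's answer against an expected one more concrete, here is a minimal Python sketch. It is not ExoGen's implementation, which the abstract does not detail. The function names (`strip_accents`, `edit_distance`, `diagnose`), the diagnosis labels, and the `max_typos` threshold are all assumptions chosen for illustration. The sketch covers only surface (graphic and orthographic) clues and leaves out the morphosyntactic analysis mentioned above.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining diacritics so that 'mangé' and 'mange' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def diagnose(answer: str, expected: str, max_typos: int = 2) -> str:
    """Return a coarse, hypothetical label for the surface difference.

    The labels and the max_typos threshold are illustrative assumptions,
    not categories taken from the paper.
    """
    answer, expected = answer.strip(), expected.strip()
    if answer == expected:
        return "correct"
    if answer.lower() == expected.lower():
        return "case error"
    if strip_accents(answer.lower()) == strip_accents(expected.lower()):
        return "diacritic error"
    if edit_distance(answer.lower(), expected.lower()) <= max_typos:
        return "probable spelling error"
    return "different word"


if __name__ == "__main__":
    for given in ["mangé", "Mangé", "mange", "manjé", "chat"]:
        print(given, "->", diagnose(given, "mangé"))
```

A purely surface-based check like this labels "mange" against an expected "mangé" as a diacritic error, although the learner may have chosen the wrong verb form. Cases of this kind suggest why surface clues alone are not enough and why the abstract pairs them with morphosyntactic analysis and the didactic context of the activity.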

Published

2013-01-14

Section

Articles

How to Cite

Blanchard, A., Kraif, O., & Ponton, C. (2013). Mastering Overdetection and Underdetection in Learner-Answer Processing: Simple Techniques for Analysis and Diagnosis. CALICO Journal, 26(3), 592-610. https://doi.org/10.1558/cj.v26i3.592-610
