Judging Grammaticality: Experiments in Sentence Classification

Joachim Wagner; Jennifer Foster; Josef van Genabith

doi:10.1558/cj.v26i3.474-490

Keywords:

Grammar Checker, Error Detection, Natural Language Parsing, Probabilistic Grammars, Precision Grammars, Decision Tree Learning, Voting Classifiers, N-gram Models, Learner Corpora

Abstract

A classifier that can distinguish a syntactically well-formed sentence from a syntactically ill-formed one has the potential to be useful in an L2 language-learning context. In this article, we describe a classifier that labels English sentences as either well formed or ill formed using information gleaned from three different natural language processing techniques. We describe the issues involved in acquiring data to train such a classifier and present experimental results for it on a variety of ill-formed sentences. We demonstrate that (a) combining information from a variety of linguistic sources is helpful, (b) the trade-off between accuracy on well-formed sentences and accuracy on ill-formed sentences can be fine-tuned by training multiple classifiers in a voting scheme, and (c) the performance of the classifier varies, with better performance on transcribed spoken sentences produced by less advanced language learners.
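The voting idea in point (b) can be illustrated with a minimal sketch. It is not the authors' implementation: the three toy rules below are hypothetical stand-ins for trained classifiers, and the names `vote`, `accuracy_by_class`, and `min_votes` are assumptions introduced only for this example. The sketch shows how raising or lowering the number of votes required to flag a sentence shifts accuracy between well-formed and ill-formed input.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier maps a sentence to True (judged ill formed) or False (well formed).
Classifier = Callable[[str], bool]


def has_repeated_word(sentence: str) -> bool:
    """Toy rule (hypothetical): flag adjacent duplicate words."""
    words = sentence.lower().split()
    return any(a == b for a, b in zip(words, words[1:]))


def is_very_short(sentence: str) -> bool:
    """Toy rule (hypothetical): flag sentences with fewer than four words."""
    return len(sentence.split()) < 4


def has_they_is(sentence: str) -> bool:
    """Toy rule (hypothetical): flag one specific agreement error."""
    return " they is " in f" {sentence.lower()} "


def vote(classifiers: Sequence[Classifier], sentence: str, min_votes: int) -> bool:
    """Flag the sentence as ill formed if at least min_votes classifiers do."""
    flags = sum(1 for clf in classifiers if clf(sentence))
    return flags >= min_votes


def accuracy_by_class(
    classifiers: Sequence[Classifier],
    labelled: List[Tuple[str, bool]],
    min_votes: int,
) -> Tuple[float, float]:
    """Return (accuracy on well-formed, accuracy on ill-formed) sentences.

    Assumes the labelled data contains at least one sentence of each class.
    """
    correct = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for sentence, is_ill_formed in labelled:
        total[is_ill_formed] += 1
        if vote(classifiers, sentence, min_votes) == is_ill_formed:
            correct[is_ill_formed] += 1
    return correct[False] / total[False], correct[True] / total[True]


# Invented toy data: (sentence, True if ill formed).
DATA = [
    ("She reads the the book.", True),
    ("They is happy.", True),
    ("Birds sing.", False),
    ("The cat sat on the mat.", False),
]
CLASSIFIERS = [has_repeated_word, is_very_short, has_they_is]

if __name__ == "__main__":
    for k in (1, 2, 3):
        well, ill = accuracy_by_class(CLASSIFIERS, DATA, k)
        print(f"min_votes={k}: well-formed={well:.2f}, ill-formed={ill:.2f}")
```

On this toy data, requiring a single vote catches every ill-formed sentence but wrongly flags one well-formed sentence. Requiring all three votes never flags a well-formed sentence but misses both ill-formed ones. The vote threshold is therefore the knob that moves the trade-off.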
