The Nature of Automated Essay Scoring Feedback

Authors

  • Semire Dikli

DOI:

https://doi.org/10.11139/cj.28.1.99-134

Keywords:

Language, Technology, Writing, Feedback

Abstract

The purpose of this study is to explore the nature of feedback that English as a Second Language (ESL) students received on their writing from either an automated essay scoring (AES) system or their teacher. The participants were 12 adult ESL students attending an intensive English center at a university in Florida. The students' drafts were analyzed in depth from a case-study perspective. While document (essay) analysis was the main data collection method, observations and interviews provided crucial information about the context in which the students wrote and the nature of each type of feedback they received. The results revealed that the nature of the AES feedback differed from that of the written teacher feedback (TF): the written TF was shorter and more focused, whereas the AES feedback was quite long, generic, and redundant. The findings suggested that AES systems are not entirely ready to meet the needs of ESL or English as a Foreign Language (EFL) students. The developing companies need to improve the feedback capabilities of these programs for nonnative English-speaking students by reducing redundancy, shortening the feedback, using simpler language, and providing feedback on short, off-topic, or repetitious essays.

Published

2013-01-14

Section

Articles

How to Cite

Dikli, S. (2013). The Nature of Automated Essay Scoring Feedback. CALICO Journal, 28(1), 99-134. https://doi.org/10.11139/cj.28.1.99-134
