Abstract:
For many ESL (English as a Second Language) and EFL (English as a Foreign Language) students, interacting with computerized applications is an integral part of their learning experience. NLP-based language models can be a valuable tool for teachers and students alike, providing prompt feedback on certain aspects of language. Recently, there have been efforts to develop grammar correction applications designed specifically with learners of English in mind. Most previous work shares a common approach: it relies on well-formed texts written by native English speakers to train a statistical model. This is mostly because constructing an error-annotated corpus large enough to support a statistical approach has, to date, been time-consuming and labor-intensive. In this talk, we present an alternative approach: training learner error correction models exclusively on an error-annotated corpus produced by ESL learners. We address the design issues and the logistical problem that arises from the partially annotated nature of our data set.
Short Bio:
Na-Rae Han is currently a Lecturer in the Department of Linguistics and Director of the Robert Henderson Language Media Center, which promotes the use of technology in language instruction, at the University of Pittsburgh. She received her Ph.D. in Linguistics from the University of Pennsylvania in 2006 after completing her M.S.E. in Computer and Information Science there. She previously worked as a researcher in the Automated Scoring and Natural Language Processing Group of Educational Testing Service (ETS) in Princeton, NJ, and as a research professor at Korea University in Seoul, Korea.