Type

Conference Proceedings

Authors

Josef van Genabith
Joachim Wagner
Jennifer Foster

Subjects

Linguistics

Topics
noise errors machine translating ungrammatical sentences treebank training parser form

Adapting a WSJ-trained parser to grammatically noisy text (2008)

Abstract

We present a robust parser which is trained on a treebank of ungrammatical sentences. The treebank is created automatically by modifying Penn Treebank sentences so that they contain one or more syntactic errors. We evaluate an existing Penn-Treebank-trained parser on the ungrammatical treebank to see how it reacts to noise in the form of grammatical errors. We then re-train this parser on the training section of the ungrammatical treebank, which significantly improves its performance on the ungrammatical test sets. Finally, we show how a classifier can be used to prevent performance degradation on the original grammatical data.
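The pipeline described in the abstract (inject errors into grammatical sentences, then use a classifier to choose between the original parser and the re-trained one) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the error types, the classifier, or any interface, so `delete_token`, `duplicate_token`, `route`, and their parameters are hypothetical names, and the sketch operates on plain token lists rather than on treebank trees.

```python
import random
from typing import Callable, List

Tokens = List[str]


def delete_token(tokens: Tokens, rng: random.Random) -> Tokens:
    """Hypothetical missing-word error: drop one token at a random position.

    Assumes a non-empty token list; returns a new list.
    """
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def duplicate_token(tokens: Tokens, rng: random.Random) -> Tokens:
    """Hypothetical extra-word error: repeat one token in place.

    Assumes a non-empty token list; returns a new list.
    """
    i = rng.randrange(len(tokens))
    return tokens[:i + 1] + tokens[i:]


def route(
    tokens: Tokens,
    looks_grammatical: Callable[[Tokens], bool],
    clean_parser: Callable[[Tokens], str],
    noisy_parser: Callable[[Tokens], str],
) -> str:
    """Send the sentence to the parser the classifier judges appropriate.

    All three callables are placeholders supplied by the caller; they are
    not the classifier or parsers used in the paper.
    """
    parser = clean_parser if looks_grammatical(tokens) else noisy_parser
    return parser(tokens)
```

In this sketch the routing step is what protects accuracy on grammatical input: a sentence the classifier accepts never reaches the parser trained on noisy data.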

Collections

Ireland -> Dublin City University -> Publication Type = Conference or Workshop Item
Ireland -> Dublin City University -> Subject = Computer Science
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> Subject = Computer Science: Machine translating
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: National Centre for Language Technology (NCLT)
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres

Experts in our system

1. Josef van Genabith, Dublin City University (Total Publications: 114)
2. Joachim Wagner, Dublin City University (Total Publications: 19)
3. Jennifer Foster, Dublin City University (Total Publications: 41)