Type

Conference Proceedings

Authors

Josef van Genabith
Deirdre Hogan
Joakim Nivre
Joseph Le Roux
Joachim Wagner
Ozlem Cetinoglu
Jennifer Foster

Subjects

Linguistics

Topics
twitter, parsing, user-generated content, web 2.0, domain adaptation, up-training, computational linguistics, self-training

From news to comment: Resources and benchmarks for parsing the language of web 2.0 (2011)

Abstract

We investigate the problem of parsing the noisy language of social media. We evaluate four Wall-Street-Journal-trained statistical parsers (Berkeley, Brown, Malt and MST) on a new dataset containing 1,000 phrase structure trees for sentences from microblogs (tweets) and discussion forum posts. We compare the four parsers on their ability to produce Stanford dependencies for these Web 2.0 sentences. We find that the parsers have a particular problem with tweets and that a substantial part of this problem is related to POS tagging accuracy. We attempt three retraining experiments involving Malt, Brown and an in-house Berkeley-style parser and obtain a statistically significant improvement for all three parsers.

Collections

Ireland -> Dublin City University -> Publication Type = Conference or Workshop Item
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing: School of Computing
Ireland -> Dublin City University -> Subject = Computer Science
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: Centre for Next Generation Localisation (CNGL)
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: National Centre for Language Technology (NCLT)
Ireland -> Dublin City University -> Subject = Computer Science: Computational linguistics
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres

