Type

Conference Proceedings

Authors

Fred Hollowood
Rasul Samad Zadeh Kaljahi
Johann Roturier
Joachim Wagner
Jennifer Foster
Raphael Rubino

Subjects

Linguistics

Topics
computational linguistics, translation quality, machine translating, machine learning, classification techniques, Dublin City, quality estimation, source language, ranking of machine translation output

DCU-Symantec submission for the WMT 2012 quality estimation task (2012)

Abstract

This paper describes the features and the machine learning methods used by Dublin City University (DCU) and Symantec for the WMT 2012 quality estimation task. Two sets of features are proposed: one constrained, i.e. respecting the data limitation suggested by the workshop organisers, and one unconstrained, i.e. using data or tools trained on data that was not provided by the workshop organisers. In total, more than 300 features were extracted and used to train classifiers in order to predict the translation quality of unseen data. In this paper, we focus on a subset of our feature set that we consider to be relatively novel: features based on a topic model built using the Latent Dirichlet Allocation approach, and features based on source and target language syntax extracted using part-of-speech (POS) taggers and parsers. We evaluate nine feature combinations using four classification-based and four regression-based machine learning techniques.
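
To make the two feature families named in the abstract more concrete, the following is a minimal sketch of how topic-based and POS-based features for a source/target sentence pair might be computed. It is an illustration only, not the feature definitions used in the paper. The function names, the three-tag set, and the inputs are all hypothetical. It assumes that both sentences have already been POS-tagged with a shared tagset, and that their topic distributions come from a topic model with a common topic space.

```python
from collections import Counter
from math import sqrt


def pos_ratio_features(source_tags, target_tags, tagset=("NOUN", "VERB", "ADJ")):
    """Per tag: relative frequency in source and target, plus the absolute difference.

    Hypothetical helper; assumes both tag sequences use the same tagset.
    """
    src_counts, tgt_counts = Counter(source_tags), Counter(target_tags)
    features = {}
    for tag in tagset:
        src = src_counts[tag] / len(source_tags) if source_tags else 0.0
        tgt = tgt_counts[tag] / len(target_tags) if target_tags else 0.0
        features[f"src_{tag}"] = src
        features[f"tgt_{tag}"] = tgt
        features[f"diff_{tag}"] = abs(src - tgt)
    return features


def topic_cosine(source_topics, target_topics):
    """Cosine similarity between two topic-probability vectors of equal length.

    Hypothetical helper; assumes both vectors index the same topics.
    """
    dot = sum(s * t for s, t in zip(source_topics, target_topics))
    norm = sqrt(sum(s * s for s in source_topics)) * sqrt(sum(t * t for t in target_topics))
    return dot / norm if norm else 0.0


# Toy inputs (made up for illustration): tag sequences and topic distributions.
feats = pos_ratio_features(["DET", "NOUN", "VERB", "NOUN"], ["DET", "NOUN", "VERB"])
feats["topic_cos"] = topic_cosine([0.7, 0.2, 0.1], [0.6, 0.3, 0.1])
```

A feature dictionary of this kind could then be passed to any classifier or regressor; the abstract does not say which learners were used, so none is assumed here.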

Collections

Ireland -> Dublin City University -> Publication Type = Conference or Workshop Item
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing: School of Computing
Ireland -> Dublin City University -> Subject = Computer Science
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: Centre for Next Generation Localisation (CNGL)
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> Subject = Computer Science: Machine translating
Ireland -> Dublin City University -> Subject = Computer Science: Computational linguistics
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres

Experts in our system

1. Joachim Wagner, Dublin City University (Total Publications: 24)
2. Jennifer Foster, Dublin City University (Total Publications: 53)
3. Raphael Rubino, Dublin City University (Total Publications: 8)