Type

Conference Proceedings

Authors

Andy Way
Yanjun Ma
Simon Petitrenaud
Patrik Lambert

Subjects

Linguistics

Topics
statistical machine translation, phrase based smt, phrase based statistical machine translation, statistical analysis, systematic study, machine translating, word alignment, european parliament

Statistical analysis of alignment characteristics for phrase-based machine translation (2010)

Abstract

In most statistical machine translation (SMT) systems, bilingual segments are extracted via word alignment. However, systematic study is lacking as to which alignment characteristics benefit MT under specific experimental settings, such as the language pair or the corpus size. In this paper we produce a set of alignments by directly tuning the alignment model according to alignment F-score and BLEU score, in order to investigate the alignment characteristics that are helpful in translation. We report results for a phrase-based SMT system on Chinese-to-English IWSLT data and Spanish-to-English European Parliament data. Based on a statistical analysis of the alignment characteristics that are correlated with BLEU score, we give alignment hints for improving BLEU score with a phrase-based SMT system on different types of corpora.

Collections

Ireland -> Dublin City University -> Publication Type = Conference or Workshop Item
Ireland -> Dublin City University -> Subject = Computer Science
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: Centre for Next Generation Localisation (CNGL)
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> Subject = Computer Science: Machine translating
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres
