Type

Conference Proceedings

Authors

Andy Way
Tony O’Dowd
Laura Casanellas
Marc Anthony Palminteri
Jinhua Du
Dimitar Shterionov

Subjects

Computer Science

Topics
technology, machine translating, training, word alignment, building, implementation, empirical evaluation, statistical machine translation

Improving KantanMT training efficiency with fast align (2016)

Abstract

In recent years, statistical machine translation (SMT) has been widely deployed in translators’ workflows, with significant improvements in productivity. However, before an SMT system can translate an unseen text, an SMT engine needs to be built. The build speed of the engine is therefore essential to the translation workflow: the sooner an engine is built, the sooner it can be used. With the increase in the computational capabilities of recent technology, the build time for an SMT engine has decreased substantially. For example, cloud-based SMT providers such as KantanMT can build high-quality, ready-to-use, custom SMT engines in less than a couple of days. To speed up this process further, we look into optimizing the word alignment step that takes place while the SMT engine is being built. Namely, we substitute the word alignment tool used by the KantanMT pipeline, Giza++, with a more efficient one, fast_align. In this work we present the design and implementation of the KantanMT pipeline that uses fast_align in place of Giza++. We also compare the two word alignment tools on industry data and report our findings. To the best of our knowledge, such an extensive empirical evaluation of the two tools has not been done before.

Collections

Ireland -> Dublin City University -> Publication Type = Conference or Workshop Item
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing: School of Computing
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: ADAPT
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> Subject = Computer Science: Machine translating
