Type

Journal Article

Authors

Andy Way
Qun Liu
Peyman Passban

Subjects

Linguistics

Topics

adaptation bilingual corpus statistical machine translation vocabulary resource machine translating computational models languages

Translating low-resource languages by vocabulary adaptation from close counterparts (2017)

Abstract

Some natural languages belong to the same family or share similar syntactic and/or semantic regularities. This property persuades researchers to share computational models across languages and benefit from high-quality models to boost existing low-performance counterparts. In this article, we follow a similar idea, whereby we develop statistical and neural machine translation (MT) engines that are trained on one language pair but are used to translate another language. First we train a reliable model for a high resource language, and then we exploit cross-lingual similarities and adapt the model to work for a close language with almost zero resources. We chose Turkish (Tr) and Azeri or Azerbaijani (Az) as the proposed pair in our experiments. Azeri suffers from lack of resources as there is almost no bilingual corpus for this language. Via our techniques, we are able to train an engine for the Az→English (En) direction, which is able to outperform all other existing models.

Collections

Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing: School of Computing
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: ADAPT
Ireland -> Dublin City University -> Publication Type = Article
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> Subject = Computer Science: Machine translating

Full list of authors on original publication

Andy Way, Qun Liu, Peyman Passban

Experts in our system

1
Andy Way
Dublin City University
Total Publications: 229
 
2
Peyman Passban
Dublin City University
Total Publications: 9