Language Relatedness and Lexical Closeness can help Improve Multilingual NMT: IITBombay@MultiIndicNMT WAT2021

Abstract

Multilingual Neural Machine Translation has achieved remarkable performance by training a single translation model for multiple languages. This paper describes our submission (Team ID: CFILT-IITB) for the MultiIndicMT: An Indic Language Multilingual Task at WAT2021. We train multilingual NMT systems by sharing encoder and decoder parameters, with a language embedding associated with each token in both the encoder and the decoder. Furthermore, we demonstrate the use of transliteration (script conversion) for Indic languages to reduce the lexical gap when training a multilingual NMT system. We also show improved performance from training a multilingual NMT system on languages of the same family, i.e., related languages.
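
The paper itself does not include code; the sketch below is a minimal PyTorch illustration of the token-plus-language-embedding idea described in the abstract. The class name, vocabulary size, embedding dimension, and number of languages are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class TokenWithLanguageEmbedding(nn.Module):
    """Sketch: add a learned language vector to every token embedding,
    so a single shared encoder-decoder knows which language it sees."""
    def __init__(self, vocab_size, num_langs, d_model):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # shared subword vocabulary
        self.lang_emb = nn.Embedding(num_langs, d_model)   # one vector per language

    def forward(self, token_ids, lang_id):
        # token_ids: (batch, seq_len); lang_id: (batch,)
        # Every position in a sentence gets the same language vector.
        tok = self.tok_emb(token_ids)
        lang = self.lang_emb(lang_id).unsqueeze(1)         # broadcast over seq_len
        return tok + lang

# Usage: embed a batch tagged with (hypothetical) language id 0.
emb = TokenWithLanguageEmbedding(vocab_size=32000, num_langs=10, d_model=512)
x = emb(torch.randint(0, 32000, (2, 7)), torch.tensor([0, 0]))
print(x.shape)  # torch.Size([2, 7, 512])
```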
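The transliteration step can likewise be sketched. Unicode allots parallel 128-code-point blocks to the major Indic scripts, so a rule-based converter can map characters between scripts by a fixed offset; this is the idea behind tools such as the Indic NLP Library's UnicodeIndicTransliterator. The snippet below is a simplified sketch of that idea, not necessarily the exact tool used in the paper (real converters add script-specific special cases):

```python
# Start of each script's Unicode block (subset of Indic scripts).
SCRIPT_BASE = {
    'hi': 0x0900,  # Devanagari
    'bn': 0x0980,  # Bengali
    'gu': 0x0A80,  # Gujarati
    'ta': 0x0B80,  # Tamil
    'te': 0x0C00,  # Telugu
    'kn': 0x0C80,  # Kannada
    'ml': 0x0D00,  # Malayalam
}

def transliterate(text, src, tgt):
    """Map each character from the source script block to the
    corresponding code point in the target script block."""
    offset = SCRIPT_BASE[tgt] - SCRIPT_BASE[src]
    out = []
    for ch in text:
        cp = ord(ch)
        # Indic blocks are 128 code points wide and largely parallel.
        if SCRIPT_BASE[src] <= cp < SCRIPT_BASE[src] + 0x80:
            out.append(chr(cp + offset))
        else:
            out.append(ch)  # leave punctuation, digits, spaces untouched
    return ''.join(out)

# Bengali -> Devanagari: converting related languages to a common
# script lets them share surface vocabulary during training.
print(transliterate('বাংলা', 'bn', 'hi'))  # -> 'बांला'
```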

Publication
To appear in Proceedings of the 8th Workshop on Asian Translation
Nikhil Saini