Malay-English Neural Machine Translation System.

This is a tool for translating English sentences into Malay and vice versa. Developing a translation tool for a low-resource language like Malay has always been a challenge. The main difficulty is that machine translation systems typically rely on a large amount of sentence-parallel data, and creating such datasets is expensive. In our work, we collected parallel datasets from various sources, including news, OpenSubtitles (OPUS), TED talks, and YouTube videos. As a result, our corpus is fairly generic and covers both written text and conversation.

We trained our models with several state-of-the-art deep Neural Machine Translation (NMT) architectures. More specifically, we tried both seq2seq and Transformer architectures to find our best model. For pre-processing and post-processing the datasets we used various tools from Moses. To train our models we used the OpenNMT-py framework, which is widely used in the NMT community for its robust and modular implementation.
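
To give a rough idea of what the pre-processing step involves, here is a minimal sketch of parallel-corpus cleaning in plain Python. It is not our actual pipeline: it only approximates the kind of filtering that Moses-style corpus cleaning scripts perform (dropping empty, overly long, or badly length-mismatched sentence pairs). The function name `clean_parallel` and the thresholds `min_len`, `max_len`, and `max_ratio` are assumptions chosen for illustration, and tokens are split on whitespace rather than with a real tokenizer.

```python
from typing import Iterable, Iterator, Tuple


def clean_parallel(
    pairs: Iterable[Tuple[str, str]],
    min_len: int = 1,        # hypothetical threshold: minimum tokens per side
    max_len: int = 80,       # hypothetical threshold: maximum tokens per side
    max_ratio: float = 9.0,  # hypothetical threshold: longest/shortest token ratio
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs that pass simple length-based filters."""
    for src, tgt in pairs:
        # Whitespace tokenization; a real pipeline would tokenize properly first.
        src_tokens = src.split()
        tgt_tokens = tgt.split()
        n_src, n_tgt = len(src_tokens), len(tgt_tokens)

        # Drop pairs where either side is empty (also avoids division by zero).
        if n_src == 0 or n_tgt == 0:
            continue
        # Drop pairs that are too short or too long on either side.
        if n_src < min_len or n_tgt < min_len:
            continue
        if n_src > max_len or n_tgt > max_len:
            continue
        # Drop pairs whose lengths differ too much; these are often misaligned.
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue

        # Re-join tokens so that runs of whitespace are normalized.
        yield " ".join(src_tokens), " ".join(tgt_tokens)


if __name__ == "__main__":
    sample = [
        ("I like rice .", "Saya suka nasi ."),
        ("", "Terima kasih"),                 # empty source side
        ("Hello", "a b c d e f g h i j k"),   # length ratio too high
    ]
    for en, ms in clean_parallel(sample):
        print(en, "|||", ms)
```

Filtering by length ratio is a cheap heuristic: sentence pairs scraped from subtitles or video captions are frequently misaligned, and a large mismatch in token counts is an easy signal to catch before training.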

Currently, the live demo can only be accessed from inside the NTU network.

M Saiful Bari
Senior Research Scientist

@NTU, Singapore, Intern’20,21,22 Amazon Web Inc. (@awscloud), T0, BLOOMZ, UXLA, xCodeEval, I train LLM at SDAIA! - Scaling Maximalist, Training lead and Core maintainer of ALLaM.