Deep Learning

GPT-3: Language Models are Few-Shot Learners

This talk summarizes the paper [`Language Models are Few-Shot Learners`](

AugVic: Exploiting BiText Vicinity for Low-Resource NMT

We propose AugVic, a data augmentation framework for sequence-to-sequence models (e.g., NMT) that uses a language model.

UXLA: A Robust Unsupervised Data Augmentation Framework for Zero-Resource Cross-Lingual NLP

We propose UXLA, a novel data augmentation framework for self-supervised learning in zero-resource transfer learning scenarios.

LNMAP: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space

We propose a novel semi-supervised method to learn cross-lingual word embeddings for BLI.

mBART: pretraining seq2seq architecture

This talk summarizes the paper [`mBART`]( and some pretraining concepts.

Zero-Resource Cross-Lingual Named Entity Recognition

We propose a superior model and training method for zero-resource transfer in cross-lingual Named Entity Recognition.

Semi-Supervised Training

This talk summarizes the papers [`mixup`](, [`MixMatch`](, [`DivideMix`](, [`UDA`](


This talk summarizes the paper [`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context`]( It assumes that the audience is already familiar with the [`Attention Is All You Need`]( paper and also discusses some of its high-level concepts.

Malay-English Neural Machine Translation System

This is a tool to translate an English sentence into Malay and vice versa.

Iterated Dilated Convolutions for NLP - NER as an example

This talk summarizes the paper [`Fast and Accurate Entity Recognition with Iterated Dilated Convolutions`](