Deep Learning

GPT-3: Language Models are Few-Shot Learners

This talk summarizes the paper [`Language Models are Few-Shot Learners`](https://arxiv.org/abs/2005.14165).
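
As a flavor of what "few-shot" means here, below is a minimal sketch of in-context prompting: k demonstrations plus the query are packed into a single prompt and the frozen model simply completes it (the translation format follows the paper's examples).

```python
# Minimal sketch of few-shot (in-context) prompting: no gradient updates,
# just k demonstrations concatenated in front of the query.
def build_few_shot_prompt(examples, query, task="Translate English to French:"):
    lines = [task]
    for src, tgt in examples:
        lines.append(f"{src} => {tgt}")   # one demonstration per line
    lines.append(f"{query} =>")           # the LM completes this line
    return "\n".join(lines)

demos = [("sea otter", "loutre de mer"), ("cheese", "fromage")]
print(build_few_shot_prompt(demos, "peppermint"))
```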

AugVic: Exploiting BiText Vicinity for Low-Resource NMT

We propose AugVic, a data augmentation framework for sequence-to-sequence models (e.g., NMT) that exploits a language model.
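
As a rough illustration (not the paper's exact procedure), one way to generate "vicinal" variants of a bitext sentence is to mask a token and let a masked language model propose in-context replacements; the model choice below is an assumption.

```python
import random
from transformers import pipeline

# Hypothetical setup: any masked LM works for this sketch.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def vicinal_variants(sentence, n_variants=3):
    tokens = sentence.split()
    pos = random.randrange(len(tokens))   # mask one random position
    masked = " ".join(tokens[:pos] + [fill_mask.tokenizer.mask_token] + tokens[pos + 1:])
    # Keep the LM's top proposals as vicinal sentences of the original.
    return [out["sequence"] for out in fill_mask(masked, top_k=n_variants)]

print(vicinal_variants("the cat sat on the mat"))
```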

UXLA: A Robust Unsupervised Data Augmentation Framework for Zero-Resource Cross-Lingual NLP

We propose UXLA, a novel data augmentation framework for self-supervised learning in zero-resource transfer learning scenarios.
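
The schematic below sketches the general self-training loop behind such frameworks, not UXLA's exact recipe: pseudo-label unlabeled target-language data and keep only confident predictions for the next round. The `model` interface and the threshold are assumptions.

```python
def self_training_round(model, labeled, unlabeled, threshold=0.9):
    """One round of self-training with confidence-based selection (schematic)."""
    pseudo = []
    for x in unlabeled:
        probs = model.predict_proba(x)                    # assumed interface
        label, conf = max(enumerate(probs), key=lambda t: t[1])
        if conf >= threshold:                             # keep confident pseudo-labels only
            pseudo.append((x, label))
    model.fit(labeled + pseudo)                           # retrain on the augmented set
    return model, pseudo
```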

LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space

We propose a novel semi-supervised method to learn cross-lingual word embeddings for BLI.
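
A minimal sketch of the core idea, with all sizes and details assumed: rather than a single linear map between the two embedding spaces, learn non-linear mappings through the latent spaces of per-language autoencoders (reconstruction losses and training omitted).

```python
import torch
import torch.nn as nn

class NonLinearLatentMap(nn.Module):
    """Sketch: source embedding -> latent -> non-linear map -> target space."""
    def __init__(self, dim=300, latent=200):
        super().__init__()
        self.enc_src = nn.Sequential(nn.Linear(dim, latent), nn.Tanh())  # source autoencoder's encoder
        self.dec_tgt = nn.Linear(latent, dim)                            # target autoencoder's decoder
        self.mapper = nn.Sequential(nn.Linear(latent, latent), nn.Tanh(),
                                    nn.Linear(latent, latent))           # non-linear latent-space map

    def forward(self, x_src):
        return self.dec_tgt(self.mapper(self.enc_src(x_src)))

print(NonLinearLatentMap()(torch.randn(4, 300)).shape)  # torch.Size([4, 300])
```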

mBART: Pretraining a Seq2Seq Architecture

This talk summarizes the [`mBART`](https://arxiv.org/abs/2001.08210) paper and some related pretraining concepts.
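
A toy sketch of the denoising objective behind this kind of pretraining: corrupt the input (here, replace one contiguous span with a single mask token) and train the seq2seq model to reconstruct the original sentence. The span choice is simplified.

```python
import random

MASK = "<mask>"

def infill_noise(tokens, span_len=3):
    """Replace one random contiguous span with a single mask token."""
    start = random.randrange(max(1, len(tokens) - span_len))
    return tokens[:start] + [MASK] + tokens[start + span_len:]

src = "we pretrain the model to reconstruct corrupted sentences".split()
print(infill_noise(src))  # corrupted encoder input
print(src)                # reconstruction target for the decoder
```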

Zero-Resource Cross-Lingual Named Entity Recognition

We propose a superior model and training method for zero-resource cross-lingual named entity recognition.
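
For context, the standard zero-resource transfer recipe (not necessarily this paper's exact method) looks like the following: fine-tune a multilingual encoder on source-language NER data, then run it unchanged on the target language.

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_name = "bert-base-multilingual-cased"     # a shared multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=9)

# 1) Fine-tune `model` on source-language NER data (e.g. English CoNLL) with
#    the usual token-classification loss.
# 2) Run inference on target-language text as-is: the shared multilingual
#    representations carry the transfer, with zero target-language labels.
```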

Semi-Supervised Training

This talk summarizes the papers [`mixup`](https://arxiv.org/abs/1710.09412), [`MixMatch`](https://arxiv.org/abs/1905.02249), [`DivideMix`](https://openreview.net/pdf?id=HJgExaVtwr), and [`UDA`](https://arxiv.org/abs/1904.12848).
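
As a concrete anchor, mixup itself fits in a few lines: train on convex combinations of example pairs, with the mixing weight drawn from a Beta distribution (per the mixup paper).

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    """x~ = lam*x1 + (1-lam)*x2, y~ = lam*y1 + (1-lam)*y2, lam ~ Beta(alpha, alpha)."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x, y = mixup(np.ones(4), np.array([1.0, 0.0]), np.zeros(4), np.array([0.0, 1.0]))
print(x, y)  # an interpolated input and its interpolated (soft) label
```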

Transformer-XL

This talk summarizes the paper [`Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context`](https://arxiv.org/abs/1901.02860). It assumes the audience is already familiar with the [`Attention Is All You Need`](https://arxiv.org/abs/1706.03762) paper, and it also discusses some of that paper's high-level concepts.
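
A minimal sketch of the paper's segment-level recurrence (relative positional encodings omitted): cached hidden states from the previous segment are prepended, without gradients, to the current segment's keys and values, extending the usable context beyond a fixed length.

```python
import torch

def attend_with_memory(h, mem, attn_layer):
    """h: current segment [T, B, D]; mem: cached previous-segment states [M, B, D]."""
    context = torch.cat([mem.detach(), h], dim=0)   # stop-gradient through memory
    out, _ = attn_layer(h, context, context)        # queries from h, keys/values from context
    return out, h.detach()                          # new memory for the next segment

attn = torch.nn.MultiheadAttention(embed_dim=64, num_heads=4)
h, mem = torch.randn(16, 2, 64), torch.randn(16, 2, 64)
out, mem = attend_with_memory(h, mem, attn)
print(out.shape)  # torch.Size([16, 2, 64])
```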

Malay-English Neural Machine Translation System

This is a tool to translate an English sentence into Malay and vice versa.
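
A hypothetical usage sketch of such a translator built on a seq2seq model; the model identifier below is a placeholder, not the tool's actual backend.

```python
from transformers import pipeline

# Placeholder model id -- substitute the tool's actual English->Malay checkpoint.
en_to_ms = pipeline("translation", model="path/to/en-ms-model")
print(en_to_ms("How are you today?")[0]["translation_text"])
```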

Iterated Dilated Convolutions for NLP - NER as an example

This talk summarizes the paper [`Fast and Accurate Entity Recognition with Iterated Dilated Convolutions`](https://arxiv.org/abs/1702.02098).
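
A small sketch of the paper's core building block, with illustrative sizes: a stack of 1-D convolutions whose dilation grows exponentially, so the receptive field widens rapidly while token-level resolution is preserved for sequence labeling tasks like NER.

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    def __init__(self, dim=128, dilations=(1, 2, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(dim, dim, kernel_size=3, dilation=d, padding=d)  # padding=d keeps length fixed
            for d in dilations
        )

    def forward(self, x):              # x: [batch, dim, seq_len]
        for conv in self.convs:
            x = torch.relu(conv(x))
        return x

x = torch.randn(2, 128, 50)            # a batch of 50-token sequences
print(DilatedBlock()(x).shape)         # torch.Size([2, 128, 50])
```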