Finetuned Language Models Are Zero-Shot Learners.
Language Models are Few-Shot Learners.
mBART (Multilingual Denoising Pre-training for Neural Machine Translation) and related pretraining concepts.
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. It assumes that the audience is …
Fast and Accurate Entity Recognition with Iterated Dilated Convolutions.
Effective Approaches to Attention-based Neural Machine Translation.
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.