This talk summarizes the paper Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. It assumes the audience is already familiar with the Attention Is All You Need paper and also discusses some of its high-level concepts.
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Attention Is All You Need
Computer science enthusiast working in deep learning and natural language processing.