LSTM cell

Transformer-XL

Abstract

This talk summarizes the paper Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. It assumes the audience is already familiar with the Attention Is All You Need paper, and it also discusses some of that paper's high-level concepts.

Date
Location
MICL Lab, Singapore
M Saiful Bari
Senior Research Scientist

@NTU, Singapore; Intern '20, '21, '22 at Amazon Web Services (@awscloud); T0, BLOOMZ, UXLA, xCodeEval. I train LLMs at SDAIA! Scaling maximalist, training lead and core maintainer of ALLaM.