mBART: pretraining a seq2seq architecture


Date:
Location: MICL Lab, Singapore

This talk goes through the pretraining objectives of seq2seq architectures. It also discusses how mBART's pretraining differs from that of XLM and its derivatives.
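As background for the abstract: mBART is pretrained as a denoising autoencoder, where the encoder sees corrupted text and the decoder reconstructs the original. Below is a minimal Python sketch (not the authors' code) of the two noise functions the mBART paper describes: text-span infilling (masking ~35% of tokens, with span lengths drawn from Poisson with λ = 3.5) and sentence permutation. Function names here are illustrative, and details such as the language-id token are omitted.

```python
import numpy as np

MASK = "<mask>"

def infill_spans(tokens, mask_ratio=0.35, poisson_lam=3.5, rng=None):
    """BART/mBART-style text infilling: replace spans of tokens with a
    single <mask>, with span lengths drawn from Poisson(poisson_lam),
    until roughly mask_ratio of the tokens have been masked."""
    rng = rng or np.random.default_rng()
    tokens = list(tokens)
    budget = int(round(mask_ratio * len(tokens)))
    while budget > 0 and len(tokens) > 1:
        span = int(rng.poisson(poisson_lam))
        span = max(1, min(span, budget, len(tokens) - 1))
        start = int(rng.integers(0, len(tokens) - span + 1))
        tokens[start:start + span] = [MASK]  # whole span -> one mask token
        budget -= span
    return tokens

def permute_sentences(sentences, rng=None):
    """mBART's second noise function: shuffle the order of the sentences
    within each training instance."""
    rng = rng or np.random.default_rng()
    out = list(sentences)
    rng.shuffle(out)
    return out

# Toy usage: the decoder is trained to reconstruct the original text
# from this corrupted encoder input.
sents = ["the cat sat on the mat .".split(), "it was warm .".split()]
noisy = [infill_spans(s) for s in permute_sentences(sents)]
print(noisy)
```

This denoising setup is the key contrast with XLM-style pretraining, which uses masked-token prediction in an encoder-only model rather than full sequence reconstruction through a decoder.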

M Saiful Bari
Senior Research Scientist

@NTU, Singapore; Intern '20, '21, '22 at Amazon Web Services (@awscloud); T0, BLOOMZ, UXLA, xCodeEval. I train LLMs at SDAIA! Scaling maximalist, training lead and core maintainer of ALLaM.