Pathways to semi-(un)supervised* NLP Brain

This talk traces the evolution of transfer learning in NLP, from LSTMs to large language models, and highlights new directions for transferability in large language models.

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

Adding new language support to the BLOOM multilingual language model for zero-shot prompting.

xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval

We introduce xCodeEval, the largest executable multilingual multitask benchmark to date, consisting of 25M document-level coding examples from about 7.5K unique problems, covering up to 17 programming languages with execution-level parallelism.
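To illustrate what "executable" means for such a benchmark, here is a minimal sketch of execution-based evaluation: a candidate solution is run against unit tests and scored by its pass rate. This is illustrative only, not xCodeEval's actual harness; the problem, function name, and test cases are made up.

```python
def evaluate_candidate(solution_src, test_cases, func_name):
    """Execute candidate source, then score it on (args, expected) pairs."""
    namespace = {}
    exec(solution_src, namespace)  # define the candidate function
    fn = namespace[func_name]
    passed = sum(fn(*args) == expected for args, expected in test_cases)
    return passed / len(test_cases)  # fraction of tests passed

# Hypothetical model output for a toy "sum two numbers" problem.
candidate = "def add(a, b):\n    return a + b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
print(evaluate_candidate(candidate, tests, "add"))  # 1.0
```

A real harness would additionally sandbox execution and enforce time and memory limits before running untrusted generated code.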

Crosslingual Generalization through Multitask Finetuning

Demonstrates that multitask finetuning enables crosslingual generalization in multilingual language models.

SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning

A semi-parametric prompt tuning method for multitask prompted learning that improves parameter-efficient fine-tuning and cross-task zero-shot generalization.
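For context, a minimal sketch of the prepend step in ordinary (fully parametric) soft prompt tuning: a small matrix of trainable prompt vectors is concatenated in front of the frozen token embeddings. SPT's semi-parametric variant goes further by also retrieving prompts from a memory; the dimensions below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, prompt_len, seq_len = 16, 4, 10

soft_prompt = rng.normal(size=(prompt_len, d_model))   # trainable parameters
token_embeds = rng.normal(size=(seq_len, d_model))     # frozen LM embeddings

# The frozen LM consumes the concatenated sequence; during tuning only
# soft_prompt receives gradient updates, so very few parameters are trained.
model_input = np.concatenate([soft_prompt, token_embeds], axis=0)
print(model_input.shape)  # (14, 16)
```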

What Language Model to Train if You Have One Million GPU Hours?

Investigating scaling laws and practical guidance for training language models under constrained compute budgets.

The crystallization of modeling methods around the Transformer architecture has been a boon for practitioners. Simple, well-motivated architectural variations can transfer across tasks and scale, increasing the impact of modeling research. However, …

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

Over 2,000 prompts for roughly 170 datasets are available through the PromptSource framework.

Multitask Prompted Training Enables Zero-Shot Task Generalization

T0 shows zero-shot task generalization on English natural language prompts, outperforming GPT-3 on many tasks, while being 16x smaller!

Nearest Neighbour Few-Shot Learning for Cross-lingual Classification

We propose a transductive approach to few-shot cross-lingual classification.
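The core nearest-neighbour step can be sketched as follows: each unlabelled query is assigned the label of its most similar labelled support example under cosine similarity. The embeddings below are synthetic stand-ins for multilingual encoder outputs, and the paper's transductive refinement over the query set is omitted.

```python
import numpy as np

def nearest_neighbour_predict(support, support_labels, queries):
    """Assign each query the label of its closest support example (cosine)."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    sims = normalize(queries) @ normalize(support).T  # cosine similarity matrix
    return [support_labels[i] for i in sims.argmax(axis=1)]

# Two labelled support points and two unlabelled queries (toy 2-D vectors).
support = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = ["positive", "negative"]
queries = np.array([[0.9, 0.1], [0.2, 0.8]])
print(nearest_neighbour_predict(support, labels, queries))
# ['positive', 'negative']
```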