pretraining

ALLaM: Large Language Models for Arabic and English

A sovereign bilingual LLM for Arabic and English that achieves state-of-the-art Arabic performance and is integrated into IBM Watsonx and Microsoft Azure.

What Language Model to Train if You Have One Million GPU Hours?

Investigates scaling laws and offers practical guidance for training language models under constrained compute budgets.