SBMARUF
compute-efficiency
What Language Model to Train if You Have One Million GPU Hours?
Investigating scaling laws and practical guidance for training language models under constrained compute budgets.