We investigate which language model to train given a fixed compute budget of one million GPU hours. By studying scaling laws for training LLMs under constrained resources, we provide practical guidance for large-scale model training decisions.
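
To make the budget concrete, the minimal sketch below converts GPU hours into training FLOPs and applies the widely used approximation C ≈ 6ND together with the compute-optimal rule of thumb D ≈ 20N from Hoffmann et al. (2022). The hardware figures (A100-class peak throughput and a 40% utilization rate) are illustrative assumptions, not values established by this work.

```python
import math

# --- Assumptions (illustrative, not from this paper) ---
PEAK_FLOPS = 312e12   # assumed peak BF16 FLOP/s for an A100-class GPU
MFU = 0.40            # assumed model FLOPs utilization
GPU_HOURS = 1e6       # the fixed compute budget from the text

# Total training compute in FLOPs.
C = GPU_HOURS * 3600 * PEAK_FLOPS * MFU

# Chinchilla-style approximation: C ~= 6 * N * D, with the
# compute-optimal rule of thumb D ~= 20 * N (Hoffmann et al., 2022).
# Substituting D into C gives C ~= 120 * N**2.
N = math.sqrt(C / 120)  # compute-optimal parameter count
D = 20 * N              # compute-optimal number of training tokens

print(f"Budget: {C:.2e} FLOPs")
print(f"Compute-optimal model size: {N / 1e9:.0f}B parameters")
print(f"Compute-optimal data size:  {D / 1e12:.1f}T tokens")
```

Under these assumptions the budget corresponds to roughly 4.5e23 FLOPs, suggesting a model of about 60B parameters trained on about 1.2T tokens; different hardware or utilization assumptions shift these numbers accordingly.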