ZeroSumEval: An Extensible Framework for Scaling LLM Evaluation with Inter-Model Competition

Abstract

We present ZeroSumEval, an extensible framework for scaling LLM evaluation through inter-model competition. By pitting models against one another in competitive evaluation protocols, the framework enables systematic, head-to-head comparison of language models.
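To illustrate the core idea behind competition-based evaluation, the following is a minimal sketch (not the ZeroSumEval API; all names, including the play_match stub, are hypothetical) of a round-robin harness in which models play zero-sum matches and are ranked by an Elo-style rating instead of a static benchmark score.

```python
# Minimal sketch of competition-based evaluation: models play pairwise
# zero-sum matches and are ranked by Elo-style ratings.
# All names below are illustrative placeholders, not the ZeroSumEval API.
from itertools import combinations

K = 32  # Elo update step size

def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def play_match(model_a: str, model_b: str) -> float:
    """Placeholder for a zero-sum game between two models.
    Returns 1.0 if model_a wins, 0.0 if it loses, 0.5 for a draw."""
    return 0.5  # stub: a real harness would run an actual game here

def round_robin(models: list[str]) -> dict[str, float]:
    """Play every pairing once and update Elo ratings after each match."""
    ratings = {m: 1000.0 for m in models}
    for a, b in combinations(models, 2):
        score_a = play_match(a, b)
        exp_a = expected_score(ratings[a], ratings[b])
        ratings[a] += K * (score_a - exp_a)
        ratings[b] += K * ((1.0 - score_a) - (1.0 - exp_a))
    return ratings

if __name__ == "__main__":
    print(round_robin(["model-1", "model-2", "model-3"]))
```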

Publication
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (System Demonstrations)