Quantifying Metric and Model Agreement in Bias Evaluation of Large Language Models
Published in The 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026
This paper presents a comprehensive study quantifying agreement among bias evaluation metrics and across Large Language Models (LLMs), aiming to provide a more robust understanding of fairness assessments in natural language processing. By systematically comparing how different models and metrics align (or diverge) when evaluating demographic biases, the work highlights critical considerations for building reliable and consistent fairness benchmarks.
Recommended citation: Asgari, A., Wu, H., Naziri, A., Kolahdouzi, M., & Seyyed-Kalantari, L. (2026). "Quantifying Metric and Model Agreement in Bias Evaluation of Large Language Models." The 64th Annual Meeting of the Association for Computational Linguistics.
Download Paper
