Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains Paper • 2510.17793 • Published Oct 20, 2025