DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 154
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 102
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 294
Scaling Latent Reasoning via Looped Language Models Paper • 2510.25741 • Published Oct 29, 2025 • 223
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 38
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation Paper • 2509.25849 • Published Sep 30, 2025 • 48
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search Paper • 2509.07969 • Published Sep 9, 2025 • 59
Reverse-Engineered Reasoning for Open-Ended Generation Paper • 2509.06160 • Published Sep 7, 2025 • 149
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2, 2025 • 84
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2, 2025 • 125
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 78