Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models • Paper • 2512.24618 • Published Dec 31, 2025 • 151
Reward Inside the Model: A Lightweight Hidden-State Reward Model for LLM's Best-of-N sampling • Paper • 2505.12225 • Published May 18, 2025 • 9
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning • Paper • 2509.22576 • Published Sep 26, 2025 • 135
Function Calling v3 • Collection • Models fine-tuned for function-calling • 14 items • Updated Apr 27, 2024 • 21