yoonkyumng

yoonkg

AI & ML interests

None yet

Organizations

None yet

upvoted 5 papers 3 months ago

Diversity-Incentivized Exploration for Versatile Reasoning

Paper • 2509.26209 • Published Sep 30, 2025 • 16

It Takes Two: Your GRPO Is Secretly DPO

Paper • 2510.00977 • Published Oct 1, 2025 • 31

CE-GPPO: Controlling Entropy via Gradient-Preserving Clipping Policy Optimization in Reinforcement Learning

Paper • 2509.20712 • Published Sep 25, 2025 • 19

Video models are zero-shot learners and reasoners

Paper • 2509.20328 • Published Sep 24, 2025 • 99

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2, 2025 • 80