arxiv:2501.13074
Songhao Wu
shwu
AI & ML interests
None yet
Recent Activity
upvoted a paper 10 days ago
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
upvoted a paper 2 months ago
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
upvoted a paper 7 months ago
The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason
Organizations
None yet