Renjie

Renjie-Ranger

https://renjie-ranger.github.io/

AI & ML interests

LLM Post-Training

Recent Activity

authored a paper 3 days ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

authored a paper 3 days ago

UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs

updated a collection 4 days ago

Feedback_Conditional_Policy

Organizations

None yet

Collections 3

Papers 4

arxiv:2509.22638

arxiv:2506.07712

arxiv:2404.07584

arxiv:2402.14008

Models 498

Renjie-Ranger/RFT-GRPO_Qwen2.5-7B

8B • Updated 4 days ago • 4

Renjie-Ranger/Base-GRPO_Qwen2.5-7B

8B • Updated 4 days ago • 6

Renjie-Ranger/FCP-Bootstrap_Qwen2.5-7B

8B • Updated 4 days ago • 7

Renjie-Ranger/all_pairs_rft_Qwen25-7B

8B • Updated Nov 26, 2025 • 3

Renjie-Ranger/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_90

8B • Updated Nov 21, 2025 • 4

Renjie-Ranger/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_85

8B • Updated Nov 21, 2025 • 3

Renjie-Ranger/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_80

8B • Updated Nov 21, 2025 • 2

Renjie-Ranger/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_75

8B • Updated Nov 21, 2025 • 4

Renjie-Ranger/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_70

8B • Updated Nov 21, 2025 • 2

Renjie-Ranger/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_65

8B • Updated Nov 21, 2025 • 4

Datasets 12

Renjie-Ranger/open_r1_math_all_sampled_128k

Viewer • Updated Nov 12, 2025 • 128k • 48

Renjie-Ranger/open_r1_math_all_sampled_64k

Viewer • Updated Nov 12, 2025 • 64k • 156

Renjie-Ranger/open_r1_math_all_sampled_32k

Viewer • Updated Nov 12, 2025 • 32k • 71

Renjie-Ranger/open_r1_math_all_sampled_16k

Viewer • Updated Nov 12, 2025 • 16k • 125

Renjie-Ranger/open_r1_math_all_sampled_8k

Viewer • Updated Nov 12, 2025 • 8k • 6

Renjie-Ranger/open_r1_math_curriculum_220k

Viewer • Updated Nov 12, 2025 • 220k • 11

Renjie-Ranger/FCP_big_math_pro_SFT

Viewer • Updated Sep 26, 2025 • 384k • 19 • 1

Renjie-Ranger/FCP_general_reasoner_pro_SFT

Viewer • Updated Sep 26, 2025 • 272k • 6

Renjie-Ranger/FCP_general_reasoner_pro_C-plus_no_concise

Viewer • Updated Sep 25, 2025 • 133k • 8

Renjie-Ranger/FCP_big_math_pro_C-plus_no_concise

Viewer • Updated Sep 25, 2025 • 185k • 10
