RL - a Michael1056 Collection

updated May 27, 2025

Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities

Paper • 2505.15692 • Published May 21, 2025 • 14

s3: You Don't Need That Much Data to Train a Search Agent via RL

Paper • 2505.14146 • Published May 20, 2025 • 19