CaRL: Learning Scalable Planning Policies with Simple Rewards • Paper • arXiv:2504.17838 • Published Apr 24, 2025
AdaptThink: Reasoning Models Can Learn When to Think • Paper • arXiv:2505.13417 • Published May 19, 2025