CaRL: Learning Scalable Planning Policies with Simple Rewards Paper • 2504.17838 • Published Apr 24, 2025
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published May 8, 2025