DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published 4 days ago • 19
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 102
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published Dec 1, 2025 • 102