The End of Manual Decoding: Towards Truly End-to-End Language Models • Paper • 2510.26697 • Published Oct 30 • 115
MLE-Smith: Scaling MLE Tasks with Automated Multi-Agent Pipeline • Paper • 2510.07307 • Published Oct 8 • 5
AceReason • Collection • Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 6 days ago • 19
UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios • Paper • 2509.21766 • Published Sep 26 • 23
Language Models Can Learn from Verbal Feedback Without Scalar Rewards • Paper • 2509.22638 • Published Sep 26 • 70
AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving • Paper • 2508.09889 • Published Aug 13 • 32
Beyond Ten Turns: Unlocking Long-Horizon Agentic Search with Large-Scale Asynchronous RL • Paper • 2508.07976 • Published Aug 11 • 51
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence • Paper • 2507.21046 • Published Jul 28 • 82
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? • Paper • 2507.12415 • Published Jul 16 • 42