- AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning
  Paper • 2509.08755 • Published • 56
- The Majority is not always right: RL training for solution aggregation
  Paper • 2509.06870 • Published • 16
- Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
  Paper • 2509.07980 • Published • 101
- Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
  Paper • 2509.03646 • Published • 30
kongqi
AI & ML interests
None yet
Recent Activity
liked a Space about 2 months ago: Salesforce/GIFT-Eval
updated a collection 3 months ago: Llm
updated a collection 3 months ago: Llm
Organizations
None yet