Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2509.07980

Reasoning Papers

about 3 hours ago

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Paper • 2508.07629 • Published Aug 11 • 42
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning

Paper • 2508.07101 • Published Aug 9 • 13
Compressing Chain-of-Thought in LLMs via Step Entropy

Paper • 2508.03346 • Published Aug 5 • 7
Train Long, Think Short: Curriculum Learning for Efficient Reasoning

Paper • 2508.08940 • Published Aug 12 • 27

reasoning_model

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published 18 days ago • 91
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30 • 3

reinforcement_learning_research

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

Paper • 2506.14245 • Published Jun 17 • 44
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 276
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 240
A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17 • 259

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56
The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning

Paper • 2509.03646 • Published Sep 3 • 30

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Tree Search for LLM Agent Reinforcement Learning

Paper • 2509.21240 • Published Sep 25 • 87

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Robot Learning from a Physical World Model

Paper • 2511.07416 • Published 27 days ago • 29
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

Paper • 2511.06805 • Published 28 days ago • 12
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

Paper • 2511.17592 • Published 21 days ago • 118

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2 • 80
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10 • 98
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28 • 116
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

Paper • 2508.19828 • Published Aug 27 • 7

thinking-mechanisms

Collection of papers research in LLM or AI Agent's thinking mechanisms

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30 • 3
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Paper • 2505.17813 • Published May 23 • 57
Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21 • 88

Visual Representation Alignment for Multimodal Large Language Models

Paper • 2509.07979 • Published Sep 9 • 83
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 211
Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 193

Reasoning Papers

about 3 hours ago

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Paper • 2508.07629 • Published Aug 11 • 42
Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning

Paper • 2508.07101 • Published Aug 9 • 13
Compressing Chain-of-Thought in LLMs via Step Entropy

Paper • 2508.03346 • Published Aug 5 • 7
Train Long, Think Short: Curriculum Learning for Efficient Reasoning

Paper • 2508.08940 • Published Aug 12 • 27

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Tree Search for LLM Agent Reinforcement Learning

Paper • 2509.21240 • Published Sep 25 • 87

reasoning_model

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published 18 days ago • 91
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30 • 3

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Robot Learning from a Physical World Model

Paper • 2511.07416 • Published 27 days ago • 29
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning

Paper • 2511.06805 • Published 28 days ago • 12
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

Paper • 2511.17592 • Published 21 days ago • 118

reinforcement_learning_research

Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

Paper • 2506.14245 • Published Jun 17 • 44
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2 • 80
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10 • 98
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28 • 116
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

Paper • 2508.19828 • Published Aug 27 • 7

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 276
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Paper • 2507.01006 • Published Jul 1 • 240
A Survey of Context Engineering for Large Language Models

Paper • 2507.13334 • Published Jul 17 • 259

thinking-mechanisms

Collection of papers research in LLM or AI Agent's thinking mechanisms

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute

Paper • 2509.04475 • Published Aug 30 • 3
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Paper • 2505.17813 • Published May 23 • 57
Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21 • 88

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Paper • 2509.08755 • Published Sep 10 • 56
The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning

Paper • 2509.03646 • Published Sep 3 • 30

Visual Representation Alignment for Multimodal Large Language Models

Paper • 2509.07979 • Published Sep 9 • 83
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper • 2509.03867 • Published Sep 4 • 211
Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 193

Previous
1
2
3
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs