Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2603.27771

Read Later 📚

Interesting papers on AI, LLMs, etc. to add to reading list

Monitored Markov Decision Processes

Paper • 2402.06819 • Published Feb 9, 2024
Generalization in Monitored Markov Decision Processes (Mon-MDPs)

Paper • 2505.08988 • Published May 13, 2025
Bayesian Risk Markov Decision Processes

Paper • 2106.02558 • Published Jun 4, 2021
Sotopia-RL: Reward Design for Social Intelligence

Paper • 2508.03905 • Published Aug 5, 2025 • 23

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3, 2025 • 76
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 109
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 513
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning

Paper • 2502.06060 • Published Feb 9, 2025 • 38
MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20, 2025 • 195
SurveyX: Academic Survey Automation via Large Language Models

Paper • 2502.14776 • Published Feb 20, 2025 • 100

local laptop dell g16 8 vRAM

unsloth/NVIDIA-Nemotron-3-Nano-4B-GGUF

Text Generation • 4B • Updated 19 days ago • 27.3k • 51
On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models

Paper • 2603.27481 • Published 7 days ago • 35
Emergent Social Intelligence Risks in Generative Multi-Agent Systems

Paper • 2603.27771 • Published 6 days ago • 50

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 36
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 50
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 47

Read Later 📚

Interesting papers on AI, LLMs, etc. to add to reading list

Monitored Markov Decision Processes

Paper • 2402.06819 • Published Feb 9, 2024
Generalization in Monitored Markov Decision Processes (Mon-MDPs)

Paper • 2505.08988 • Published May 13, 2025
Bayesian Risk Markov Decision Processes

Paper • 2106.02558 • Published Jun 4, 2021
Sotopia-RL: Reward Design for Social Intelligence

Paper • 2508.03905 • Published Aug 5, 2025 • 23

local laptop dell g16 8 vRAM

unsloth/NVIDIA-Nemotron-3-Nano-4B-GGUF

Text Generation • 4B • Updated 19 days ago • 27.3k • 51
On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models

Paper • 2603.27481 • Published 7 days ago • 35
Emergent Social Intelligence Risks in Generative Multi-Agent Systems

Paper • 2603.27771 • Published 6 days ago • 50

Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward

Paper • 2510.03222 • Published Oct 3, 2025 • 76
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 109
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 513
Multi-Agent Tool-Integrated Policy Optimization

Paper • 2510.04678 • Published Oct 6, 2025 • 31

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 36
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 50
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 47

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69
Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning

Paper • 2502.06060 • Published Feb 9, 2025 • 38
MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20, 2025 • 195
SurveyX: Academic Survey Automation via Large Language Models

Paper • 2502.14776 • Published Feb 20, 2025 • 100

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs