Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2512.20848 • Published 24 days ago • 34
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation Paper • 2601.00664 • Published 15 days ago • 52
Chaining the Evidence: Robust Reinforcement Learning for Deep Search Agents with Citation-Aware Rubric Rewards Paper • 2601.06021 • Published 8 days ago • 39
Qwen3-VL-Embedding and Qwen3-VL-Reranker: A Unified Framework for State-of-the-Art Multimodal Retrieval and Ranking Paper • 2601.04720 • Published 9 days ago • 43
User-Oriented Multi-Turn Dialogue Generation with Tool Use at scale Paper • 2601.08225 • Published 4 days ago • 46
AT^2PO: Agentic Turn-based Policy Optimization via Tree Search Paper • 2601.04767 • Published 9 days ago • 26
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice Paper • 2601.05175 • Published 9 days ago • 32
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 9 days ago • 191
Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding Paper • 2512.17220 • Published 29 days ago • 111
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 19 days ago • 94
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 17 days ago • 137
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space Paper • 2512.24617 • Published 17 days ago • 56
The Path Not Taken: RLVR Provably Learns Off the Principals Paper • 2511.08567 • Published Nov 11, 2025 • 33