Alignment Makes Language Models Normative, Not Descriptive Paper • 2603.17218 • Published 2 days ago • 32
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 9 days ago • 70
IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST Article • 29 days ago • 18
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published Jan 29 • 155
Alterbute: Editing Intrinsic Attributes of Objects in Images Paper • 2601.10714 • Published Jan 15 • 31
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2, 2025 • 235
Story2Board: A Training-Free Approach for Expressive Storyboard Generation Paper • 2508.09983 • Published Aug 13, 2025 • 70
CLEAR: Error Analysis via LLM-as-a-Judge Made Easy Paper • 2507.18392 • Published Jul 24, 2025 • 20
Effective Red-Teaming of Policy-Adherent Agents Paper • 2506.09600 • Published Jun 11, 2025 • 39
Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games Paper • 2506.05309 • Published Jun 5, 2025 • 16
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning Paper • 2505.17813 • Published May 23, 2025 • 58
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning Paper • 2505.10320 • Published May 15, 2025 • 24
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models Paper • 2505.02847 • Published May 1, 2025 • 30