LatentUMM: Dual Latent Alignment for Unified Multimodal Models Paper • 2605.17766 • Published May 18 • 9
Stress-Testing the Reasoning Competence of LLMs With Proofs Under Minimal Formalism Paper • 2605.12524 • Published Apr 7 • 4
TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation Paper • 2605.22355 • Published May 21 • 179
Evaluating Cognitive Age Alignment in Interactive AI Agents Paper • 2605.17894 • Published May 18 • 5
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published May 13 • 274
latency-sensitive-bench/deadly_corridor_jitter_latency_uniform_min_1_max_3 • Updated May 14 • 2.59k • 45 • 1