ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models Paper • 2512.07843 • Published Nov 24, 2025 • 22
Post • ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration (2511.21689)
A PINN Approach to Symbolic Differential Operator Discovery with Sparse Data Paper • 2212.04630 • Published Dec 9, 2022
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17, 2025 • 98
Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published Jan 8, 2025 • 96
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published Nov 7, 2024 • 51
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 50
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation Paper • 2410.03960 • Published Oct 4, 2024 • 2
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models Paper • 2410.03290 • Published Oct 4, 2024 • 7
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 134
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts Paper • 2407.21770 • Published Jul 31, 2024 • 22
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Paper • 2406.18521 • Published Jun 26, 2024 • 31
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models Paper • 2306.12420 • Published Jun 21, 2023 • 2