moonshotai/Kimi-Linear-48B-A3B-Instruct Text Generation • 49B • Updated 23 days ago • 53.2k • 519
Ministral 3 Collection Mistral Ministral 3: new multimodal models in Base, Instruct, and Reasoning variants, available in 3B, 8B, and 14B sizes. • 36 items • Updated 16 days ago • 26
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published Oct 17, 2025 • 49
The Ultra-Scale Playbook 🌌 Space • Running • 3.63k • The ultimate guide to training LLM on large GPU Clusters
Gemma 3-270m Collection Collection of models for Gemma 3-270m • 4 items • Updated 24 days ago • 21
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning Paper • 2507.14111 • Published Jul 18, 2025 • 23