The ultimate guide to RL environments: building and scaling them in the LLM era • Running • 156 • Building and scaling RL environments for LLM training
nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 Any-to-Any • 33B • Updated 6 days ago • 238k • 287
Distilling 100B+ Models 40x Faster with TRL • Running • Featured • 81 • TRL distillation for 100B+ teachers, 40x faster