The Alignment Waltz: Jointly Training Agents to Collaborate for Safety • Paper • 2510.08240 • Published Oct 9 • 41
The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks • Paper • 2509.25671 • Published Sep 30 • 6
Jointly Reinforcing Diversity and Quality in Language Model Generations • Paper • 2509.02534 • Published Sep 2 • 24
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning • Paper • 2410.01044 • Published Oct 1, 2024 • 35
Meta Llama 3 • Collection • This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 872