Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming • Paper • 2501.18837 • Published Jan 31, 2025 • 10
vectara/hallucination_evaluation_model • Text Classification • 0.1B • Updated Oct 20, 2025 • 167k • 338