Miaow-Lab's Collections

RUT-Bench

Benchmark data for the paper "Beyond Ideal Instruction: A Comprehensive Framework for Evaluating LLMs in Realistic Interactions".