Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence Paper • 2511.07384 • Published 27 days ago • 16
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning Paper • 2507.16746 • Published Jul 22 • 35
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10 • 101
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18 • 19
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 47
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games Paper • 2411.13543 • Published Nov 20, 2024 • 19
RadGraph: Extracting Clinical Entities and Relations from Radiology Reports Paper • 2106.14463 • Published Jun 28, 2021
Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation Paper • 2302.12893 • Published Feb 24, 2023
Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management Paper • 2108.01764 • Published Aug 3, 2021
Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14, 2024 • 20
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis Paper • 2410.02749 • Published Oct 3, 2024 • 13
BARTScore: Evaluating Generated Text as Text Generation Paper • 2106.11520 • Published Jun 22, 2021 • 2
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios Paper • 2307.13528 • Published Jul 25, 2023 • 1
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing Paper • 2107.13586 • Published Jul 28, 2021