LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ Paper • 2409.16779 • Published Sep 25, 2024
Enhancing Inflation Nowcasting with LLM: Sentiment Analysis on News Paper • 2410.20198 • Published Oct 26, 2024
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments Paper • 2509.14233 • Published Sep 17, 2025 • 14
ModernVBERT: Towards Smaller Visual Document Retrievers Paper • 2510.01149 • Published Oct 1, 2025 • 30
Benchmarking Optimizers for Large Language Model Pretraining Paper • 2509.01440 • Published Sep 1, 2025 • 24
Towards Open Foundation Language Model and Corpus for Macedonian: A Low-Resource Language Paper • 2506.09560 • Published Jun 11, 2025
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26, 2025 • 75
MIRIAD: Augmenting LLMs with millions of medical query-response pairs Paper • 2506.06091 • Published Jun 6, 2025 • 9
On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial Paper • 2403.14380 • Published Mar 21, 2024 • 1
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models Paper • 2311.16079 • Published Nov 27, 2023 • 19
Evaluating the Search Phase of Neural Architecture Search Paper • 1902.08142 • Published Feb 21, 2019
Landmark Attention: Random-Access Infinite Context Length for Transformers Paper • 2305.16300 • Published May 25, 2023