Thomas Betton

tbetton

thomasbtnfr

AI & ML interests

None yet

Recent Activity

upvoted an article 8 days ago

Mixture of Experts Explained

upvoted an article 20 days ago

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background

liked a Space about 1 month ago

OpenEvals/evaluation-guidebook

Organizations

upvoted an article 8 days ago

Article

Mixture of Experts Explained

Dec 11, 2023 • 1.05k

upvoted an article 20 days ago

Article

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background

Feb 28, 2025

upvoted an article about 1 month ago

Article

Improving Prompt Consistency with Structured Generations

Apr 30, 2024

upvoted 2 articles 7 months ago

Article

No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL

Jun 3, 2025

Article

Training and Finetuning Sparse Embedding Models with Sentence Transformers v5

Jul 1, 2025 • 132

upvoted a paper 9 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 189

upvoted 2 articles 12 months ago

Article

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

Feb 4, 2025

Article

4D masks support in Transformers

Jan 8, 2024

upvoted an article about 1 year ago

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Oct 7, 2024

upvoted a paper about 1 year ago

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 47
