Vimal Thilak

vimalthilak

AggieInCA

authored 4 papers 3 months ago

Vanishing Gradients in Reinforcement Finetuning of Language Models

Paper • 2310.20703 • Published Oct 31, 2023 • 1

LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures

Paper • 2312.04000 • Published Dec 7, 2023

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks

Paper • 2407.03475 • Published Jul 3, 2024

Rethinking JEPA: Compute-Efficient Video SSL with Frozen Teachers

Paper • 2509.24317 • Published Sep 29, 2025 • 10

authored a paper 11 months ago

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Paper • 2501.12370 • Published Jan 21, 2025 • 11