nezubn (Ankit Sharma)

upvoted an article 5 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 735

upvoted an article 8 months ago

Cohere on Hugging Face Inference Providers 🔥

Article • Apr 16 • 129

upvoted an article 9 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7 • 253

upvoted 2 papers about 1 year ago

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 49

Teach Multimodal LLMs to Comprehend Electrocardiographic Images

Paper • 2410.19008 • Published Oct 21, 2024 • 26

upvoted 3 articles over 1 year ago

Optimizing your LLM in production

Article • Sep 15, 2023 • 22

Getting Started With Embeddings

Article • Jun 23, 2022 • 97

Quanto: a PyTorch quantization backend for Optimum

Article • Mar 18, 2024 • 45

upvoted 3 papers over 1 year ago

upvoted an article over 1 year ago

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

Article • May 24, 2023 • 171

upvoted 8 papers over 1 year ago

Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 65

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 107

Advancing LLM Reasoning Generalists with Preference Trees

Paper • 2404.02078 • Published Apr 2, 2024 • 46

Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs

Paper • 2403.20041 • Published Mar 29, 2024 • 34

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 111

Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction

Paper • 2403.18795 • Published Mar 27, 2024 • 20

Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27, 2024 • 26

The case for 4-bit precision: k-bit Inference Scaling Laws

Paper • 2212.09720 • Published Dec 19, 2022 • 3

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

LoRA Learns Less and Forgets Less

Make Your LLM Fully Utilize the Context
