Building on HF

Aritra Roy Gosthipaty PRO

ariG23498

huggingface

https://arig23498.github.io/

AI & ML interests

Deep Representation Learning

Recent Activity

updated a dataset about 10 hours ago

model-metadata/custom-code-models

updated a dataset about 10 hours ago

model-metadata/trending_models_metadata

updated a bucket about 12 hours ago

ariG23498/torch-traces

liked a model 4 days ago

google/gemma-4-E2B-it-assistant

Any-to-Any • 78M • Updated 3 days ago • 11.6k • 49

liked a Space 9 days ago

The ultimate guide to RL environments: building and scaling them in the LLM era

Building and scaling RL environments for LLM training

liked 6 models 14 days ago

ibm-granite/granite-vision-4.1-4b

Image-Text-to-Text • 4B • Updated 8 days ago • 23.8k • 73

nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16

Any-to-Any • 33B • Updated 6 days ago • 238k • 287

lewtun/talkie-1930-13b-it-hf

Text Generation • 13B • Updated 16 days ago • 6.7k • 23

RedHatAI/Qwen3.6-35B-A3B-NVFP4

Updated 24 days ago • 1.83M • 136

talkie-lm/talkie-1930-13b-it

Updated 21 days ago • 266

inclusionAI/LLaDA2.0-Uni

Any-to-Any • 16B • Updated 3 days ago • 2.08k • 244

liked a model 22 days ago

google/gemma-4-31B

Image-Text-to-Text • 33B • Updated Apr 2 • 359k • 371

liked a model 27 days ago

Qwen/Qwen3.6-35B-A3B

Image-Text-to-Text • 36B • Updated 21 days ago • 4.61M • 1.76k

liked a Space 28 days ago

Transformers PR Dashboard

PR triage dashboard with cluster and signal analysis

liked 6 models about 1 month ago

google/gemma-4-E2B

Any-to-Any • 5B • Updated Apr 2 • 616k • 262

microsoft/harrier-oss-v1-0.6b

Feature Extraction • 0.6B • Updated Mar 30 • 203k • 227

google/gemma-4-26B-A4B-it

Image-Text-to-Text • 27B • Updated 7 days ago • 7.73M • 950

google/gemma-4-E4B-it

Any-to-Any • 8B • Updated 7 days ago • 5.96M • 994

google/gemma-4-31B-it

Image-Text-to-Text • 33B • Updated 7 days ago • 9.79M • 2.63k

google/gemma-4-E2B-it

Any-to-Any • 5B • Updated 7 days ago • 3.48M • 611

liked a Space about 1 month ago

Distilling 100B+ Models 40x Faster with TRL

TRL distillation for 100B+ teachers, 40x faster

liked 2 models about 1 month ago

Qwen/Qwen3-VL-Embedding-2B

Sentence Similarity • 2B • Updated 28 days ago • 1.98M • 401

arcee-ai/Trinity-Large-Thinking

Text Generation • 399B • Updated 6 days ago • 21k • 168