Bowen Peng

bloc97

AI & ML interests

Machine Learning, Computer Graphics, Language Models

Recent Activity

liked a model 8 days ago

ideogram-ai/ideogram-4-fp8

upvoted a paper 15 days ago

JLT: Clean-Latent Prediction in Latent Diffusion Transformers

upvoted a paper 24 days ago

Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation

commented a paper about 1 month ago

Efficient Pre-Training with Token Superposition

Paper • 2605.06546 • Published May 7 • 46

New activity in PsycheFoundation/consilience-40b-7Y9v38s5 11 months ago

fixed typo in readme

#1 opened about 1 year ago by

New activity in NousResearch/Nous-Capybara-34B over 2 years ago

How did you train this without going OOM in RAM & VRAM?

#15 opened over 2 years ago by

New activity in NousResearch/Yarn-Mistral-7b-128k over 2 years ago

VRAM usage for full 128k tokens

#5 opened over 2 years ago by

sliding_window = 131072? Sliding window attention doesn't work for 128?

#4 opened over 2 years ago by

New activity in NousResearch/Yarn-Llama-2-13b-64k almost 3 years ago

Hardware requirements for the model.

#1 opened almost 3 years ago by