1. Base training: Pre-training on the FineWeb-EDU dataset using the nanochat framework
2. Mid-training: General instruction tuning on SmolTalk, MMLU, GSM8K, and spelling tasks
3. SFT (Supervised Fine-Tuning): Chat-specific training on ARC, GSM8K, and SmolTalk
4. RL (Reinforcement Learning): Optional GRPO-style training on GSM8K (if included)
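
The stage order above can be written down as data, which makes the optional RL step explicit. This is only an illustrative sketch: the `Stage` class, the `PIPELINE` tuple, the short stage names, and the `final_stage` helper are hypothetical names introduced here and are not part of nanochat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical helper type, not a nanochat class)."""
    name: str
    datasets: tuple[str, ...]
    optional: bool = False


# Stage order and datasets as listed in this README.
PIPELINE = (
    Stage("base", ("FineWeb-EDU",)),
    Stage("mid", ("SmolTalk", "MMLU", "GSM8K", "Spelling")),
    Stage("sft", ("ARC", "GSM8K", "SmolTalk")),
    Stage("rl", ("GSM8K",), optional=True),
)


def final_stage(rl_included: bool) -> Stage:
    """Return the last stage that ran, skipping optional stages when absent."""
    stages = [s for s in PIPELINE if rl_included or not s.optional]
    return stages[-1]


print(final_stage(rl_included=False).name)  # sft
```

When the RL stage was not run, the SFT stage is the last one in the sequence, so its checkpoint is the natural one to chat with.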
├── tokenizer/
│   ├── tokenizer.pkl               # Tokenizer
│   └── token_bytes.pt              # Token byte mappings
├── base_checkpoints/d12/           # Pre-training checkpoint
│   ├── model_*.pt
│   └── meta_*.json
├── mid_checkpoints/d12/            # Mid-training checkpoint
│   ├── model_*.pt
│   └── meta_*.json
├── chatsft_checkpoints/d12/        # SFT checkpoint
│   ├── model_*.pt
│   └── meta_*.json
├── chatsft_checkpoints_int8/d12/   # SFT checkpoint (int8)
│   ├── model_*.pt
│   └── meta_*.json
├── chatrl_checkpoints/d12/         # RL checkpoint (if available)
│   ├── model_*.pt
│   └── meta_*.json
└── logs/                           # Training logs
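
Given the layout above, a small helper can pick the newest checkpoint in one of the `d12` directories and read its metadata. This is a minimal sketch using only the standard library. It assumes the `*` in `model_*.pt` and `meta_*.json` is a numeric training step shared by both files, which this README does not state. The names `latest_checkpoint` and `read_meta` are hypothetical.

```python
import json
import re
from pathlib import Path

# Assumption: the wildcard in model_*.pt is a numeric training step.
STEP_RE = re.compile(r"^model_(\d+)\.pt$")


def latest_checkpoint(ckpt_dir):
    """Return (step, model_path, meta_path) for the highest step, or None."""
    best = None
    for path in Path(ckpt_dir).glob("model_*.pt"):
        match = STEP_RE.match(path.name)
        if match is None:
            continue  # skip files whose suffix is not a plain number
        step = int(match.group(1))
        if best is None or step > best[0]:
            # Assumption: the meta file carries the same step suffix.
            meta_path = path.with_name(f"meta_{match.group(1)}.json")
            best = (step, path, meta_path)
    return best


def read_meta(meta_path):
    """Load the JSON metadata stored next to a model file."""
    with open(meta_path, encoding="utf-8") as f:
        return json.load(f)
```

For example, `latest_checkpoint("chatsft_checkpoints/d12")` would return the SFT model file with the largest step, or `None` if the directory holds no matching files. Loading the weights themselves is left out here because the README does not describe the contents of the `.pt` files.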
MIT License (same as nanochat)
@misc{nanochat,
author = {Andrej Karpathy},
title = {nanochat: The best ChatGPT that $100 can buy},
year = {2025},
publisher = {GitHub},
url = {https://github.com/karpathy/nanochat}
}