Chess-Bot-3000 250M
A 250M parameter language model trained from scratch on chess games in UCI notation. The model learns to predict chess moves from game context and can adapt its play style based on player Elo ratings.
Disclaimer: This model card is mostly LLM generated and may contain mistakes or missing information.
Model Details
Model Description
This model is a transformer-based language model trained on millions of chess games from the Lichess database. It uses UCI (Universal Chess Interface) notation and includes special tokens for player Elo ratings and game outcomes, allowing it to generate moves appropriate for different skill levels.
- Developed by: David Hauser (https://github.com/kinggongzilla)
- Model type: Causal language model (decoder-only transformer)
- Language(s): Chess UCI notation
- License: Apache 2.0
- Architecture: Qwen2-style (SmolLM3 base)
- Parameters: ~250M
Model Sources
- Repository: https://github.com/kinggongzilla/chess-bot-3000
How to Use
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("daavidhauser/chess-bot-3000-100m")
tokenizer = AutoTokenizer.from_pretrained("daavidhauser/chess-bot-3000-100m")

# Condition on player Elo ratings and the desired outcome, then supply
# the moves played so far (here White's opening move, e2e4).
prompt = "<BOG> <WHITE:1500> <BLACK:1600> <BLACK_WIN> e2e4"
inputs = tokenizer(prompt, return_tensors="pt")

# One token corresponds to one half-move, so a single new token is
# the model's reply for Black.
outputs = model.generate(inputs["input_ids"], max_new_tokens=1)
print(tokenizer.decode(outputs[0]))
```
Note:
- Passing the first move "e2e4" makes the model generate the next move for Black.
- Setting <BLACK_WIN> conditions the model to predict moves from games that Black won.
- The Elo tokens for White and Black induce the model to play at the given Elo levels.
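The conditioning prefix can be assembled programmatically. A minimal sketch (the `build_prompt` helper is ours, not part of the repository; ratings are rounded to the nearest 100 and clamped to 0-3500 to match the training vocabulary):

```python
def build_prompt(white_elo, black_elo, outcome, moves=()):
    """Assemble a conditioning prompt from Elo ratings, an outcome
    token (WHITE_WIN, BLACK_WIN, or DRAW), and any UCI moves played
    so far. Hypothetical helper, illustrating the prompt format only."""
    def bucket(elo):
        # Round to the nearest 100 and clamp to the trained range.
        return min(max(round(elo / 100) * 100, 0), 3500)

    parts = ["<BOG>", f"<WHITE:{bucket(white_elo)}>",
             f"<BLACK:{bucket(black_elo)}>", f"<{outcome}>"]
    parts.extend(moves)
    return " ".join(parts)

print(build_prompt(1487, 1612, "BLACK_WIN", ["e2e4"]))
# → <BOG> <WHITE:1500> <BLACK:1600> <BLACK_WIN> e2e4
```

The returned string can be fed directly to the tokenizer in the usage example above.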
Evaluation
Figure: 250M model performance against Stockfish at various Elo levels.

Uses
Direct Use
The model can be used for:
- Chess move prediction and game continuation
- Generating chess games at specific skill levels (by conditioning on Elo tokens)
- Chess position evaluation through next-move probabilities
- Chess education and analysis tools
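Because each half-move is a single token, position evaluation via next-move probabilities amounts to renormalizing the model's next-token distribution over the legal moves. A minimal sketch (the function name and dict-based interface are illustrative, not part of the model's API):

```python
import math

def legal_move_probabilities(logits, legal_token_ids):
    """Renormalize next-token logits over the legal moves only.

    `logits` maps token id -> raw score (e.g. the model's final-layer
    output at the current position); `legal_token_ids` are the token
    ids of the legal moves. Returns token id -> probability."""
    scores = [logits[t] for t in legal_token_ids]
    m = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return {t: e / z for t, e in zip(legal_token_ids, exps)}

# Toy logits for three move tokens; only tokens 7 and 11 are legal here.
probs = legal_move_probabilities({7: 2.0, 11: 1.0, 42: -1.0}, [7, 11])
print(round(probs[7], 3))  # → 0.731
```

In practice the logits would come from a forward pass over the prompt, and the legal-move ids from a chess library such as python-chess.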
Training Details
Training Data
The model was trained on approximately 100 million chess games from the Lichess open database (January 2024). Games were converted to UCI notation and augmented with:
- Player Elo ratings (rounded to nearest 100, range 0-3500)
- Game outcomes (<WHITE_WIN>, <BLACK_WIN>, <DRAW>)
- Special tokens for game boundaries
Each training example follows the format:
<BOG> <WHITE:1500> <BLACK:1500> <DRAW> ... <EOG>
In this representation, each chess move (half-move) corresponds to one token.
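With one token per half-move plus the metadata tokens, sequence lengths are easy to estimate. A quick illustration (whitespace splitting is assumed to mirror the tokenizer, since every move and special token is space-separated):

```python
# A short example game in the training format.
game = "<BOG> <WHITE:1500> <BLACK:1500> <DRAW> e2e4 e7e5 g1f3 b8c6 f1b5 <EOG>"
tokens = game.split()

# Special tokens are angle-bracketed; everything else is a half-move.
metadata = [t for t in tokens if t.startswith("<")]
moves = [t for t in tokens if not t.startswith("<")]
print(len(metadata), len(moves))  # → 5 5
```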
Training Infrastructure:
- Framework: Nanotron (PyTorch)
- Hardware: 1x NVIDIA A100 GPU
- Total training time: ~5 hours