Chess-Bot-3000 250M

A 250M parameter language model trained from scratch on chess games in UCI notation. The model learns to predict chess moves from game context and can adapt its play style based on player Elo ratings.

Disclaimer: This model card is mostly LLM generated and may contain mistakes or missing information.

Model Details

Model Description

This model is a transformer-based language model trained on millions of chess games from the Lichess database. It uses UCI (Universal Chess Interface) notation and includes special tokens for player Elo ratings and game outcomes, allowing it to generate moves appropriate for different skill levels.

  • Developed by: David Hauser (https://github.com/kinggongzilla)
  • Model type: Causal language model (decoder-only transformer)
  • Language(s): Chess UCI notation
  • License: Apache 2.0
  • Architecture: Qwen2-style (SmolLM3 base)
  • Parameters: ~250M
  • Tensor type: BF16

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("daavidhauser/chess-bot-3000-250m")
tokenizer = AutoTokenizer.from_pretrained("daavidhauser/chess-bot-3000-250m")

prompt = "<BOG> <WHITE:1500> <BLACK:1600> <BLACK_WIN> e2e4"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_new_tokens=1)  # one token = one half-move
print(tokenizer.decode(outputs[0]))

Note:

  1. Given the first move "e2e4", the model generates Black's reply.
  2. Setting <BLACK_WIN> conditions the model to predict moves from games that Black won.
  3. The Elo tokens for White and Black condition the model to play at the corresponding Elo levels.
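Prompt assembly can be wrapped in a small helper. The function below is a hypothetical convenience, not part of the released code; the Elo clamping range (0-3500) and rounding to the nearest 100 follow the Training Data section of this card.

```python
def build_prompt(white_elo: int, black_elo: int, outcome: str, moves=()):
    """Assemble a conditioning prompt in this card's token format.

    outcome must be one of <WHITE_WIN>, <BLACK_WIN>, <DRAW>.
    Elo ratings are clamped to 0-3500 and rounded to the nearest 100,
    matching the rating tokens used during training.
    """
    if outcome not in ("<WHITE_WIN>", "<BLACK_WIN>", "<DRAW>"):
        raise ValueError(f"unknown outcome token: {outcome}")

    def bucket(elo: int) -> int:
        # Clamp to the trained rating range, then round to the nearest 100.
        return round(min(max(elo, 0), 3500) / 100) * 100

    parts = ["<BOG>", f"<WHITE:{bucket(white_elo)}>",
             f"<BLACK:{bucket(black_elo)}>", outcome]
    parts.extend(moves)  # UCI half-moves played so far
    return " ".join(parts)
```

For example, `build_prompt(1500, 1600, "<BLACK_WIN>", ["e2e4"])` reproduces the prompt string used in the snippet above.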

Evaluation

[Figure: 250M model performance against Stockfish at various Elo levels]

Uses

Direct Use

The model can be used for:

  • Chess move prediction and game continuation
  • Generating chess games at specific skill levels (by conditioning on Elo tokens)
  • Chess position evaluation through next-move probabilities
  • Chess education and analysis tools
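Because each half-move is a single token, "position evaluation through next-move probabilities" reduces to a softmax over the logits of the candidate move tokens at the current position. A minimal sketch with made-up logits (the moves and values are illustrative, not actual model output):

```python
import math

def move_probabilities(move_logits: dict) -> dict:
    """Softmax a {uci_move: logit} map into {uci_move: probability}."""
    m = max(move_logits.values())  # subtract the max for numerical stability
    exps = {mv: math.exp(v - m) for mv, v in move_logits.items()}
    total = sum(exps.values())
    return {mv: e / total for mv, e in exps.items()}

# Illustrative logits for three candidate replies to 1. e4:
probs = move_probabilities({"e7e5": 2.1, "c7c5": 1.8, "e7e6": 0.4})
```

In practice the logits would come from a forward pass over the prompt, restricted to the token ids of the legal moves in the position.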

Training Details

Training Data

The model was trained on approximately 100 million chess games from the Lichess open database (January 2024). Games were converted to UCI notation and augmented with:

  • Player Elo ratings (rounded to nearest 100, range 0-3500)
  • Game outcomes (<WHITE_WIN>, <BLACK_WIN>, <DRAW>)
  • Special tokens for game boundaries

Each training example follows the format <BOG> <WHITE:1500> <BLACK:1500> <DRAW> ... <EOG>, where the ellipsis stands for the game's UCI moves. In this representation, each half-move corresponds to one token.
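A hypothetical serializer for this training format (the function name and result strings are illustrative; the actual preprocessing code is not reproduced in this card):

```python
def serialize_game(white_elo, black_elo, result, uci_moves):
    """Render one training example: header tokens, UCI half-moves, <EOG>.

    result uses PGN-style strings ("1-0", "0-1", "1/2-1/2");
    Elo ratings are rounded to the nearest 100, as in the training data.
    """
    outcomes = {"1-0": "<WHITE_WIN>", "0-1": "<BLACK_WIN>", "1/2-1/2": "<DRAW>"}
    header = [
        "<BOG>",
        f"<WHITE:{round(white_elo / 100) * 100}>",
        f"<BLACK:{round(black_elo / 100) * 100}>",
        outcomes[result],
    ]
    return " ".join(header + list(uci_moves) + ["<EOG>"])
```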

Training Infrastructure:

  • Framework: Nanotron (PyTorch)
  • Hardware: 1x NVIDIA A100 GPU
  • Total training time: ~5 hours

[Figure: Loss curves for the 100M and 250M models]
