supralabs-50M-testing

This is an experimental ChatML supervised fine-tuning (SFT) run starting from the base model SupraLabs/Supra-1.5-50M-Base-exp.

Training Setup

| Field | Value |
|---|---|
| Base model | SupraLabs/Supra-1.5-50M-Base-exp |
| Output repo | User01110/supralabs-50M-testing |
| Sequence length | 1024 |
| Max optimizer steps | 10,000 |
| Per-device batch size | 128 |
| Gradient accumulation | 4 |
| Sample presentations per GPU | 5,120,000 |
| Max token slots per GPU | 5,242,880,000 |
| Learning rate | 2.00e-04 |
| Warmup steps | 100 |
| Weight decay | 0.05 |
| Save/push cadence | every 1,000 optimizer steps plus final |
| Loss mask | assistant response only |
| Chat format | ChatML |
| System prompt | You are a helpful assistant. |
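
The "assistant response only" loss mask can be illustrated with a short sketch. This is not the training code for this run. It assumes the common convention of copying the input ids into the labels and setting every non-assistant position to -100, the default ignore index of PyTorch's cross-entropy loss. The function name `mask_non_assistant` and the span format are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_non_assistant(input_ids, assistant_spans):
    """Return labels that keep only the tokens inside assistant spans.

    assistant_spans is a list of (start, end) index pairs, end exclusive.
    Both this function and the span format are illustrative assumptions.
    """
    labels = [IGNORE_INDEX] * len(input_ids)
    for start, end in assistant_spans:
        labels[start:end] = input_ids[start:end]
    return labels


# Toy ids: positions 0-4 stand for the system and user turns,
# positions 5-7 stand for the assistant response.
input_ids = [11, 12, 13, 14, 15, 21, 22, 23]
labels = mask_non_assistant(input_ids, [(5, 8)])
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 23]
```

With labels built this way, only the assistant tokens contribute to the loss, while the system and user tokens still serve as context.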

The data stream reloops each dataset as often as needed to fill the fixed step budget. The one exception is Cutecat6152/python-data-basic, which is capped at three passes because it has only 100 rows.
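
A relooping stream with a per-dataset pass cap might look like the sketch below. The card does not say how sources are mixed, so the round-robin interleaving here is an assumption, and the names `reloop` and `interleave` are hypothetical.

```python
from itertools import islice


def reloop(rows, max_passes=None):
    """Yield rows repeatedly; stop after max_passes full passes if a cap is set."""
    passes = 0
    while max_passes is None or passes < max_passes:
        yield from rows
        passes += 1


def interleave(streams):
    """Round-robin over streams, dropping any stream that is exhausted.

    Round-robin mixing is an assumption made for this illustration.
    """
    active = [iter(s) for s in streams]
    while active:
        still_active = []
        for it in active:
            try:
                item = next(it)
            except StopIteration:
                continue  # capped source is used up; leave it out from now on
            still_active.append(it)
            yield item
        active = still_active


big = ["a1", "a2", "a3"]   # stands in for a large source that reloops freely
small = ["s1"]             # stands in for the capped 100-row source
stream = interleave([reloop(big), reloop(small, max_passes=3)])
first_ten = list(islice(stream, 10))
print(first_ten)
```

The capped source contributes exactly three presentations per row, after which the uncapped source continues alone until the consumer stops reading.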

The sources listed below contain 3,667,971 unique rows in a single pass. Counting python-data-basic at its three-pass cap, the first cycle amounts to 3,668,171 source presentations. The 10,000-step training budget presents 5,120,000 examples per GPU (10,000 steps × 128 per-device batch × 4 accumulation steps), so larger sources are expected to reloop during training.
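
The budget figures in the Training Setup table and the row totals can be re-derived with a few lines of arithmetic. The sketch below uses only numbers stated in this card. The variable names are illustrative, and the final comparison assumes each GPU draws from the full mix, which the card does not state.

```python
# Values from the Training Setup table.
max_steps = 10_000
per_device_batch = 128
grad_accum = 4
seq_len = 1024

presentations_per_gpu = max_steps * per_device_batch * grad_accum
token_slots_per_gpu = presentations_per_gpu * seq_len

# Row counts from the Dataset Mix table.
rows = {
    "nvidia/Nemotron-SFT-Instruction-Following-Chat-v2": 1_068_273,
    "microsoft/orca-math-word-problems-200k": 200_035,
    "TIGER-Lab/MathInstruct": 262_039,
    "Programming-Language/codeagent-python": 296_837,
    "Cutecat6152/python-data-basic": 100,
    "flytech/python-codes-25k": 49_626,
    "QuixiAI/open-instruct-uncensored": 1_756_115,
    "openai/gsm8k (main)": 7_473,
    "openai/gsm8k (socratic)": 7_473,
    "EleutherAI/arithmetic": 20_000,
}
unique_rows = sum(rows.values())

# python-data-basic is allowed three passes, so it adds two extra passes.
capped_passes = 3
first_cycle = unique_rows + (capped_passes - 1) * rows["Cutecat6152/python-data-basic"]

print(presentations_per_gpu)  # 5120000
print(token_slots_per_gpu)    # 5242880000
print(unique_rows)            # 3667971
print(first_cycle)            # 3668171
print(presentations_per_gpu > first_cycle)  # True
```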

ChatML Compatibility

The tokenizer is saved with:

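| Token | Purpose |
|---|---|
| `<\|im_start\|>` | Opens a message turn and is followed by the role name |
| `<\|im_end\|>` | Closes a message turn |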

The uploaded tokenizer includes the ChatML template, so inference and future SFT should not require manually adding these tokens again.

Example prompt:

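from transformers import AutoTokenizer

# Load the tokenizer, including its bundled ChatML template, from the output repo.
tokenizer = AutoTokenizer.from_pretrained("User01110/supralabs-50M-testing")
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a neural network is in simple terms."},
]
# Render the conversation as text that ends with an open assistant turn.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)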
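
For readers without the tokenizer at hand, the shape of the resulting prompt can be approximated in plain Python. The sketch below follows the conventional ChatML layout. The template bundled with this tokenizer may differ in whitespace or special-token handling, and `render_chatml` is a hypothetical helper, not part of any library.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render messages in the conventional ChatML layout (an approximation)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Leave an assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a neural network is in simple terms."},
]
rendered = render_chatml(example_messages)
print(rendered)
```

To see the real template output, compare this string with the `prompt` produced by `apply_chat_template` for the same messages.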

Dataset Mix

| Dataset | Config | Split | Rows | Schema | Mapping | Pass policy |
|---|---|---|---|---|---|---|
| nvidia/Nemotron-SFT-Instruction-Following-Chat-v2 | default | reasoning_off | 1,068,273 | messages[{role, content, reasoning_content}] | user/assistant message pairs; reasoning_off only | reloops as needed |
| microsoft/orca-math-word-problems-200k | default | train | 200,035 | question, answer | user=question; assistant=answer | reloops as needed |
| TIGER-Lab/MathInstruct | default | train | 262,039 | instruction, output | user=instruction; assistant=output | reloops as needed |
| Programming-Language/codeagent-python | default | train | 296,837 | prompt, response | user=prompt; assistant=response | reloops as needed |
| Cutecat6152/python-data-basic | default | train | 100 | id, instruction, response | user=instruction; assistant=response | max 3 passes, 300 presentations max |
| flytech/python-codes-25k | default | train | 49,626 | instruction, input, output, text | user=instruction plus optional Input block; assistant=output | reloops as needed |
| QuixiAI/open-instruct-uncensored | default | train | 1,756,115 | dataset, id, messages[{role, content}] | user/assistant message pairs | reloops as needed |
| openai/gsm8k | main | train | 7,473 | question, answer | user=question; assistant=answer | reloops as needed |
| openai/gsm8k | socratic | train | 7,473 | question, answer | user=question; assistant=answer | reloops as needed |
| EleutherAI/arithmetic | 10 selected subsets | validation raw JSONL | 20,000 | context, completion | user=context with trailing Answer: stripped; assistant=completion | reloops as needed |
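
Two of the less obvious mappings in the table can be sketched as small functions. These are illustrations under stated assumptions, not the preprocessing code used for this run: the exact layout of the "Input block", the whitespace handling, and the choice to prepend the system prompt to every sample are guesses, and all function names are hypothetical.

```python
SYSTEM_PROMPT = "You are a helpful assistant."  # from the Training Setup table


def to_messages(user_text, assistant_text):
    """Wrap one user/assistant pair in ChatML-style message dicts."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


def map_python_codes(row):
    """flytech/python-codes-25k: instruction plus optional Input block."""
    user = row["instruction"]
    if row.get("input"):
        # The "Input:" label and spacing are assumptions.
        user = f"{user}\n\nInput:\n{row['input']}"
    return to_messages(user, row["output"])


def map_arithmetic(row):
    """EleutherAI/arithmetic: strip a trailing 'Answer:' from the context."""
    user = row["context"].rstrip()
    if user.endswith("Answer:"):
        user = user[: -len("Answer:")].rstrip()
    # Stripping whitespace around the completion is an assumption.
    return to_messages(user, row["completion"].strip())


# Toy rows shaped like the schemas in the table, not real dataset samples.
arith = map_arithmetic({"context": "Question: What is 2 plus 3?\nAnswer:", "completion": " 5"})
code = map_python_codes({"instruction": "Sort the list.", "input": "[3, 1, 2]", "output": "sorted([3, 1, 2])"})
print(arith[1]["content"])  # Question: What is 2 plus 3?
print(code[1]["content"])
```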

Notes

  • Dataset schemas and row counts were checked through Hugging Face Dataset Viewer metadata where available.
  • Nemotron is loaded from the direct reasoning_off.jsonl file to avoid mixing in reasoning-on schema fields.
  • EleutherAI arithmetic is loaded from raw JSONL files to avoid old dataset-script loading issues.
  • RoPE buffers and tokenizer/model load are verified during final export.