honkhazard-1
8.78M (4.19M embed, 2L/1H) | 176M seen | 32K vocab
an experiment to train only on synthetic messages! it failed, but I'm publishing it anyway because why not :)
- parameters: 8.78M (4.19M embed, 4.19M head, 0.26M mlp, 0.13M attn)
- tokens seen: 176M
- num_layers: 2
- num_heads: 1
- vocab_size: 32768
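for reference, the breakdown above is consistent with a model dimension of 128 and a 4x MLP expansion; those two values are inferred from the listed counts (4.19M embed / 32768 vocab = 128) rather than stated here, so treat this as a sketch of the arithmetic, not the exact config:

```python
# Rough parameter accounting for honkhazard-1.
# d_model=128 and mlp_ratio=4 are inferred from the listed counts,
# not confirmed by the card. Biases and norm parameters are ignored.
vocab_size = 32768
d_model = 128      # assumed
n_layers = 2
mlp_ratio = 4      # assumed

embed = vocab_size * d_model                            # ~4.19M (input embedding)
head = vocab_size * d_model                             # ~4.19M (untied LM head)
attn = n_layers * 4 * d_model * d_model                 # Q, K, V, O projections ~0.13M
mlp = n_layers * 2 * d_model * (mlp_ratio * d_model)    # up + down projections ~0.26M

total = embed + head + attn + mlp
print(f"{total / 1e6:.2f}M")                            # ~8.78M
```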
tokenizer
experimented a bit with different vocab sizes, settled on 32K for now
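the card doesn't record which tokenizer algorithm or library was used, so the following is only a plausible sketch: a byte-level BPE tokenizer trained to a 32,768-token vocabulary with the Hugging Face `tokenizers` library, reserving the special tokens that appear in the pre-training format below (the corpus filename is hypothetical):

```python
# Hypothetical sketch: training a 32K byte-level BPE tokenizer with the
# Hugging Face `tokenizers` library. The actual algorithm, library, and
# training corpus are not stated on the card.
from tokenizers import Tokenizer, models, trainers, pre_tokenizers, decoders

special_tokens = [
    "<|bos|>", "<|user_start|>", "<|user_end|>",
    "<|assistant_start|>", "<|assistant_end|>",
    "<|reasoning_start|>", "<|reasoning_end|>",
]

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(vocab_size=32768, special_tokens=special_tokens)
tokenizer.train(files=["synth_messages.txt"], trainer=trainer)  # hypothetical corpus file
tokenizer.save("tokenizer.json")
```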
pre-training
pre-trained only on SYNTH messages in the following format:
<|bos|><|user_start|>{{query}}<|user_end|><|assistant_start|><|reasoning_start|>{{synthetic_reasoning}}<|reasoning_end|>{{synthetic_answer}}<|assistant_end|>
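a small helper that renders a synthetic (query, reasoning, answer) triple into this format could look like the sketch below; the token strings come straight from the template above, while the function and argument names are just illustrative:

```python
# Illustrative helper: render one synthetic example into the pre-training
# format shown above. Function and argument names are hypothetical.
def format_synth_example(query: str, reasoning: str, answer: str) -> str:
    return (
        "<|bos|>"
        f"<|user_start|>{query}<|user_end|>"
        "<|assistant_start|>"
        f"<|reasoning_start|>{reasoning}<|reasoning_end|>"
        f"{answer}"
        "<|assistant_end|>"
    )

print(format_synth_example("what's 2+2?", "2 plus 2 is 4", "4"))
```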
post-training
no post-training of any form has been performed on this model
postmortem
text stops being coherent after ~8 tokens, although it does show some understanding of the prompt. the layers were misconfigured from the start without me noticing, the idea is obviously flawed, the param count is far too low, etc.