Orion Atlas 1B
Cendrix AI | orion-atlas-7b | GitHub | Paper
Status
Training in progress. Currently at iter 11,000 / 50,000. Weights will be published upon completion (March 25, 2026).
This repository contains the model architecture, config, and training details. Weights incoming.
Architecture
Orion Atlas 1B is the first model in the Orion Atlas family, a custom transformer architecture built from scratch by Cendrix AI.
| Property | Value |
|---|---|
| Parameters | 1,147,766,784 (~1.15B) |
| Layers | 24 |
| Attention heads | 16 (GQA: 4 KV heads) |
| Hidden dim | 2048 |
| FFN | SwiGLU |
| Normalization | RMSNorm |
| Position | RoPE with YaRN context extension |
| Context | 4,096 tokens (extensible via YaRN) |
| Tokenizer | Custom SentencePiece, 32K vocab |
| Architecture | Mini-Llama style: RoPE, GQA, SwiGLU, RMSNorm, Flash Attention |
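The figures in the table can be cross-checked with a short parameter count. The sketch below is illustrative and is not taken from the repository's model.py: `AtlasConfig`, `count_parameters`, and `kv_cache_values_per_token` are hypothetical names, and four inputs are assumptions that this card does not state (a vocabulary of exactly 32,000 entries, a SwiGLU inner width of 5,632, tied input and output embeddings, and bias-free linear layers). Under those assumptions the total happens to match the reported 1,147,766,784.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtlasConfig:
    # Values taken from the architecture table.
    n_layers: int = 24
    n_heads: int = 16
    n_kv_heads: int = 4
    hidden_dim: int = 2048
    # Assumptions, not stated in the model card.
    vocab_size: int = 32_000
    ffn_dim: int = 5632
    tie_embeddings: bool = True

    @property
    def head_dim(self) -> int:
        return self.hidden_dim // self.n_heads

    @property
    def queries_per_kv_head(self) -> int:
        # With GQA, each KV head is shared by this many query heads.
        return self.n_heads // self.n_kv_heads


def count_parameters(cfg: AtlasConfig) -> int:
    kv_dim = cfg.n_kv_heads * cfg.head_dim
    # Q and O projections are hidden x hidden; K and V are hidden x kv_dim.
    attention = 2 * cfg.hidden_dim * cfg.hidden_dim + 2 * cfg.hidden_dim * kv_dim
    # SwiGLU uses three matrices: gate, up, and down.
    ffn = 3 * cfg.hidden_dim * cfg.ffn_dim
    # Two RMSNorm weight vectors per layer (before attention, before FFN).
    norms = 2 * cfg.hidden_dim
    per_layer = attention + ffn + norms

    embedding = cfg.vocab_size * cfg.hidden_dim
    output_head = 0 if cfg.tie_embeddings else embedding
    final_norm = cfg.hidden_dim
    return cfg.n_layers * per_layer + embedding + output_head + final_norm


def kv_cache_values_per_token(cfg: AtlasConfig) -> int:
    # One key and one value vector per KV head, for every layer.
    return 2 * cfg.n_layers * cfg.n_kv_heads * cfg.head_dim


cfg = AtlasConfig()
print(cfg.head_dim)                    # 128
print(cfg.queries_per_kv_head)         # 4
print(count_parameters(cfg))           # 1147766784
print(kv_cache_values_per_token(cfg))  # 24576
```

With 4 KV heads instead of 16, the cache holds 24,576 values per token rather than the 98,304 that full multi-head attention would need at the same head size, a 4x reduction.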
Model Family
| Model | Params | Status |
|---|---|---|
| Orion Atlas 1B | 1.15B | Training |
| Orion Atlas 3B | ~3B | Planned |
| Orion Atlas 7B | 8.77B | Architecture released |
| Orion Atlas 14B | ~14B | Planned |
| Orion Atlas 37B | ~37B | Planned |
The 7B model uses a Mamba-2 Hybrid + Differential Attention architecture, the first known combination of these techniques. See orion-atlas-7b.
Training
- Pre-training data: FineWeb-Edu, SlimPajama, StarCoder, The Stack, OpenWebMath, Cosmopedia, GPT4All, Tulu-SFT (~17B tokens)
- Hardware: NVIDIA H100 80GB HBM3 (RunPod)
- Framework: Custom PyTorch training loop
- Optimizer: AdamW, lr=3e-4, cosine decay
- Batch: 40 × 2048 tokens (effective ~82K tokens/step)
- Speed: ~50,000 tok/s
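The batch, speed, and schedule figures above imply a token budget and a rough wall-clock estimate. The following is a minimal sketch of that arithmetic together with a plain cosine decay, not the project's training loop. It assumes that one iteration is one optimizer step at the effective batch size, that there is no warmup, and that the learning rate decays to zero; none of these is stated in this card, and the function names are hypothetical.

```python
import math

# Values taken from the training list.
BATCH_SEQUENCES = 40
SEQ_LEN = 2048
TOTAL_ITERS = 50_000
PEAK_LR = 3e-4
TOKENS_PER_SECOND = 50_000

# Assumptions, not stated in the model card.
WARMUP_ITERS = 0
MIN_LR = 0.0


def tokens_per_step() -> int:
    return BATCH_SEQUENCES * SEQ_LEN


def total_tokens(iters: int = TOTAL_ITERS) -> int:
    return iters * tokens_per_step()


def estimated_hours(iters: int = TOTAL_ITERS) -> float:
    # Pure throughput estimate; ignores evaluation and checkpointing time.
    return total_tokens(iters) / TOKENS_PER_SECOND / 3600


def cosine_lr(step: int) -> float:
    # Optional linear warmup, then cosine decay from PEAK_LR to MIN_LR.
    if step < WARMUP_ITERS:
        return PEAK_LR * (step + 1) / WARMUP_ITERS
    span = max(1, TOTAL_ITERS - WARMUP_ITERS)
    progress = min(1.0, (step - WARMUP_ITERS) / span)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))


print(tokens_per_step())       # 81920
print(total_tokens())          # 4096000000
print(total_tokens(11_000))    # 901120000
print(round(estimated_hours(), 1))
print(cosine_lr(0), cosine_lr(TOTAL_ITERS // 2), cosine_lr(TOTAL_ITERS))
```

Under these assumptions the full run covers about 4.1B tokens, and the 11,000 iterations reported in the status note correspond to roughly 0.9B tokens.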
Design Goals
Built for agentic tasks: tool calling, structured JSON output, and multi-step reasoning. The 1B serves as the seed model for progressive distillation up to the 37B flagship.
Usage
This model uses a custom architecture (not HuggingFace Transformers compatible). See model.py for inference code.
# Coming with weights release
Citation
@misc{palermini2026orionatlas,
title={Orion Atlas: A Mamba-2 Hybrid Architecture with Differential Attention for Agentic Language Models},
author={Avery Palermini},
year={2026},
institution={Cendrix AI},
}
© 2026 Cendrix AI · Grove City, OH