Orion Atlas 1B

Cendrix AI: orion-atlas-7b | GitHub | Paper


Status

🟡 Training in progress: currently at iteration 11,000 of 50,000. Weights will be published upon completion (March 25, 2026).

This repository contains the model architecture, config, and training details. Weights incoming.


Architecture

Orion Atlas 1B is the first model in the Orion Atlas family, a custom transformer architecture built from scratch by Cendrix AI.

| Property | Value |
| --- | --- |
| Parameters | 1,147,766,784 (~1.15B) |
| Layers | 24 |
| Attention heads | 16 (GQA: 4 KV heads) |
| Hidden dim | 2048 |
| FFN | SwiGLU |
| Normalization | RMSNorm |
| Position | RoPE with YaRN context extension |
| Context | 4,096 tokens (extensible via YaRN) |
| Tokenizer | Custom SentencePiece, 32K vocab |
| Architecture | Mini-Llama style: RoPE, GQA, SwiGLU, RMSNorm, Flash Attention |
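
As a sanity check on the figures above, the parameter count can be reproduced with a short calculation. This is a sketch, not the repository's code: the vocabulary of exactly 32,000 entries, the tied input/output embeddings, the SwiGLU hidden width of 5,632, and the absence of bias terms are all assumptions that the model card does not state. They are used here only because together they reproduce the listed total. The second function estimates KV-cache size to show what 4 KV heads (GQA) save compared with 16, assuming 2-byte cache elements.

```python
# Hypothetical reconstruction of the parameter count for Orion Atlas 1B.
# ASSUMPTIONS (not stated in the model card): vocab = 32,000 exactly,
# tied input/output embeddings, SwiGLU hidden width d_ff = 5,632, no biases.

def count_params(n_layers=24, d_model=2048, n_heads=16, n_kv_heads=4,
                 d_ff=5632, vocab=32_000, tied_embeddings=True):
    """Count weights of a Llama-style decoder with GQA, SwiGLU and RMSNorm."""
    head_dim = d_model // n_heads          # 128 for 2048 / 16
    kv_dim = n_kv_heads * head_dim         # 512 with 4 KV heads
    # Q and O projections are d_model x d_model; K and V are d_model x kv_dim.
    attn = 2 * d_model * d_model + 2 * d_model * kv_dim
    # SwiGLU uses three matrices: gate, up and down projections.
    ffn = 3 * d_model * d_ff
    # Two RMSNorm weight vectors per layer (pre-attention and pre-FFN).
    norms = 2 * d_model
    embed = vocab * d_model * (1 if tied_embeddings else 2)
    final_norm = d_model
    return embed + n_layers * (attn + ffn + norms) + final_norm


def kv_cache_bytes(seq_len=4096, n_layers=24, d_model=2048, n_heads=16,
                   n_kv_heads=4, bytes_per_elem=2):
    """Size of the K and V caches for one sequence (2 bytes/elem is assumed)."""
    head_dim = d_model // n_heads
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


if __name__ == "__main__":
    print(f"parameters: {count_params():,}")
    print(f"KV cache, GQA (4 KV heads):  {kv_cache_bytes() / 2**20:.0f} MiB")
    print(f"KV cache, MHA (16 KV heads): {kv_cache_bytes(n_kv_heads=16) / 2**20:.0f} MiB")
```

Under these assumptions the total matches the 1,147,766,784 listed in the table, and the per-sequence KV cache at the full 4,096-token context is a quarter of what full multi-head attention would need.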

Model Family

| Model | Params | Status |
| --- | --- | --- |
| Orion Atlas 1B | 1.15B | 🟡 Training |
| Orion Atlas 3B | ~3B | 📋 Planned |
| Orion Atlas 7B | 8.77B | 🏗️ Architecture released |
| Orion Atlas 14B | ~14B | 📋 Planned |
| Orion Atlas 37B | ~37B | 📋 Planned |

The 7B model uses a Mamba-2 Hybrid + Differential Attention architecture, the first known combination of these techniques. See orion-atlas-7b.


Training

  • Pre-training data: FineWeb-Edu, SlimPajama, StarCoder, The Stack, OpenWebMath, Cosmopedia, GPT4All, Tulu-SFT (~17B tokens)
  • Hardware: NVIDIA H100 80GB HBM3 (RunPod)
  • Framework: Custom PyTorch training loop
  • Optimizer: AdamW, lr=3e-4, cosine decay
  • Batch: 40 × 2048 tokens (effective ~82K tokens/step)
  • Speed: ~50,000 tok/s
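
The batch, iteration, and throughput figures above imply a token budget and a rough wall-clock time, worked out in the sketch below. It assumes the batch size and throughput stay constant for all 50,000 iterations. The cosine schedule is also an assumption: the model card says only "cosine decay", so this version has no warmup and decays to a floor of zero. The actual training loop may differ on both points.

```python
import math

# Figures taken from the training list above.
BATCH_SEQUENCES = 40
SEQ_LEN = 2048
TOTAL_ITERS = 50_000
TOKENS_PER_SECOND = 50_000
PEAK_LR = 3e-4


def tokens_per_step():
    """Tokens consumed by one optimizer step."""
    return BATCH_SEQUENCES * SEQ_LEN


def total_tokens():
    """Tokens seen over the full run, assuming a fixed batch size."""
    return tokens_per_step() * TOTAL_ITERS


def wall_clock_hours():
    """Training time at the quoted throughput, assuming it is sustained."""
    return total_tokens() / TOKENS_PER_SECOND / 3600


def cosine_lr(step, peak_lr=PEAK_LR, total=TOTAL_ITERS, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr (no warmup; min_lr=0 is assumed)."""
    progress = min(max(step / total, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(f"tokens/step:  {tokens_per_step():,}")
    print(f"total tokens: {total_tokens():,}")
    print(f"wall clock:   {wall_clock_hours():.1f} h")
    for step in (0, 11_000, 25_000, 50_000):
        print(f"lr at step {step:>6,}: {cosine_lr(step):.2e}")
```

On these numbers the run covers about 4.1B tokens, roughly a quarter of the ~17B-token corpus, in a little under 23 hours of sustained compute.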

Design Goals

Built for agentic tasks: tool calling, structured JSON output, and multi-step reasoning. The 1B serves as the seed model for progressive distillation up to the 37B flagship.


Usage

This model uses a custom architecture and is not compatible with Hugging Face Transformers. See model.py for inference code.

# Coming with weights release

Citation

@misc{palermini2026orionatlas,
  title={Orion Atlas: A Mamba-2 Hybrid Architecture with Differential Attention for Agentic Language Models},
  author={Avery Palermini},
  year={2026},
  institution={Cendrix AI},
}

© 2026 Cendrix AI · Grove City, OH
