Part of the Outlier shipping lineup. Outlier is a free macOS app (Apple Silicon only) that runs this model locally with one click.
Outlier Quick 26B-A4B (MLX 4-bit)
Sparse MoE tier (26B params, ~4B active per token). Sits between Lite and Core in latency, with stronger thinking-mode reasoning. Optimized for general chat and reasoning, not for code generation.
Try it in Outlier
The simplest way to use this model is through the Outlier app — open the tier picker, select Outlier Quick, click download, and chat. No setup, no Python, no MLX install, no token quotas.
➡ Download Outlier — outlier.host
A screenshot of the tier picker is at outlier.host/screenshots/tier-picker.png.
Load this directly (power users)
If you want the raw MLX 4-bit weights without the app:
pip install mlx-lm
python -m mlx_lm.generate \
--model Outlier-Ai/Outlier-Quick-26B-MLX-4bit \
--prompt "Write a quicksort in Python." \
--max-tokens 512
Or from Python:

from mlx_lm import load, generate

# First call downloads and caches the weights from the Hub
model, tokenizer = load("Outlier-Ai/Outlier-Quick-26B-MLX-4bit")
print(generate(model, tokenizer, prompt="Hello", max_tokens=256))
Verified benchmarks
For σ-qualified MMLU, HumanEval, and Mac inference-speed numbers — with full provenance (source file, command, n, stderr, date) — see outlier.host/benchmarks.
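For intuition about what a σ on a proportion metric like MMLU accuracy looks like, here is a minimal sketch that computes a binomial standard error from the score and sample size reported on this card. This assumes the σ is a simple binomial stderr; the benchmarks page documents the actual method.

```python
import math

def binomial_stderr(p: float, n: int) -> float:
    """Standard error of a proportion estimated from n Bernoulli trials."""
    return math.sqrt(p * (1.0 - p) / n)

# MMLU accuracy 0.793 over a stratified n=300 sample (figures from this card)
se = binomial_stderr(0.793, 300)
print(f"0.793 +/- {se:.3f}")
```

With n=300, the stderr works out to roughly ±0.023, which is why small score differences between tiers at this sample size should be read cautiously.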
Other Outlier shipping tiers
- Outlier Nano 4B (entry tier, ~3 GB)
- Outlier Lite 9B (balanced, ~6 GB)
- Outlier Core 27B (default, ~16 GB)
- Outlier Code 27B (code-tuned, ~16 GB)
- Outlier Vision 35B-A3B (multimodal, ~20 GB)
License
Apache 2.0 (inherits from upstream base model). Conversion artifact only — the underlying weights are governed by the base model's license.
Evaluation results
- MMLU (stratified n=300, test set), accuracy, self-reported: 0.793
- HumanEval (test set), pass@1, self-reported: 0.128