# NEXUS Shared Expert Weights (10K Steps)
Trained shared expert weights from NEXUS (Neural Expert Unified Specialization) calibration run.
## Model Details
- **Base Model:** GPT-OSS 120B
- **Training Steps:** 10,000
- **Method:** Top-24 PCA-selected experts, frozen router (see the selection sketch after this list)
- **Parameters:** 896,106,240 (shared expert only)
- **Size:** 1.67 GB (BF16)
- **Training Config:** Frozen router, advanced scheduler, KL distillation
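
The expert-selection step itself is not shipped in this repository. The following is a minimal sketch of what top-k selection via PCA over per-expert activation statistics might look like; the `activations` layout and the variance-based scoring are illustrative assumptions, not the actual NEXUS implementation:

```python
import torch

def select_top_k_experts(activations: torch.Tensor, k: int = 24) -> torch.Tensor:
    """Rank experts by calibration-set variance along their first principal component.

    activations: (num_experts, num_tokens, hidden) tensor of per-expert
    outputs collected on a calibration corpus (an assumed layout).
    """
    scores = []
    for expert_acts in activations:  # (num_tokens, hidden)
        centered = expert_acts - expert_acts.mean(dim=0, keepdim=True)
        # The largest singular value measures variance along the top PCA direction.
        scores.append(torch.linalg.svdvals(centered)[0])
    return torch.topk(torch.stack(scores), k).indices
```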
## What This Contains
This file contains ONLY the shared expert weights (216 parameter tensors) from a NEXUS-trained model.
To use it:

1. Start with the base GPT-OSS 120B model.
2. Add the NEXUS shared expert architecture.
3. Load these weights (a fuller sketch follows the Usage snippet below).
## Usage
```python
import torch

# Load the shared expert weights (216 parameter tensors).
shared_weights = torch.load("nexus_shared_expert_weights_10k.pt", map_location="cpu")

# Apply to a model that already has the NEXUS shared expert architecture in place.
# strict=False is required because the file covers only the shared expert parameters.
model.load_state_dict(shared_weights, strict=False)
```
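
A fuller end-to-end sketch of the three steps above. The `openai/gpt-oss-120b` Hub id is assumed for the base checkpoint, and `attach_shared_expert` is a hypothetical stand-in for the architecture-modification helper; check the NEXUS codebase for the actual entry point:

```python
import torch
from transformers import AutoModelForCausalLM

from nexus import attach_shared_expert  # hypothetical import, see the NEXUS repo

# 1. Start from the base MoE model (Hub id assumed; adjust to your local path).
model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-120b", torch_dtype=torch.bfloat16, device_map="auto"
)

# 2. Add the NEXUS shared expert modules to the MoE layers.
attach_shared_expert(model)

# 3. Load the trained shared expert weights on top of the base weights.
shared_weights = torch.load("nexus_shared_expert_weights_10k.pt", map_location="cpu")
missing, unexpected = model.load_state_dict(shared_weights, strict=False)

# Sanity checks: the file should contribute exactly 216 tensors, and none of
# them should be rejected as unexpected keys.
assert len(shared_weights) == 216
assert not unexpected, f"unmatched keys: {unexpected[:5]}"
```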
## About NEXUS
NEXUS enables efficient domain specialization of massive MoE models by training a small shared expert while keeping routed experts frozen.
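
The training setup is simple to express in code: freeze every parameter, then re-enable gradients only for the shared expert. A minimal sketch, assuming shared-expert parameters are identifiable by a `shared_expert` substring in their names (the naming convention is an assumption):

```python
import torch

def prepare_for_nexus_training(model: torch.nn.Module) -> torch.optim.Optimizer:
    """Freeze the router and routed experts; leave only the shared expert trainable."""
    for name, param in model.named_parameters():
        # Parameters of the shared expert modules stay trainable (assumed naming);
        # the router and all routed experts remain frozen.
        param.requires_grad = "shared_expert" in name

    trainable = [p for p in model.parameters() if p.requires_grad]
    print(f"trainable parameters: {sum(p.numel() for p in trainable):,}")
    return torch.optim.AdamW(trainable, lr=1e-5)  # lr is illustrative
```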
See: https://github.com/yourusername/nexus
## License
MIT