metadata
title: MiniMind Max2 API
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
- text-generation
- moe
- fastapi
- language-model
🧠 MiniMind Max2 API
Tiny Model, Powerful Experience - An efficient language model API with a FastAPI backend.
Features
- Mixture of Experts (MoE): Only 25% of parameters are activated per token
- Grouped Query Attention: 4:1 ratio of query heads to key/value heads for memory efficiency
- FastAPI Backend: RESTful API with automatic docs
- Gradio Interface: Interactive UI for testing
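To make the first two features concrete, here is a minimal sketch of top-k expert routing and of the query-to-KV head mapping under a 4:1 ratio. The expert count (8), the top-k value (2), and the function names are assumptions chosen so that 2 of 8 experts matches the stated 25%; the README does not specify the actual router configuration, and shared parameters such as attention and embeddings are ignored here.

```python
import math
import random

NUM_EXPERTS = 8  # hypothetical; the README does not state the expert count
TOP_K = 2        # hypothetical; 2 of 8 experts gives a 25% active fraction


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:top_k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def kv_head_for_query_head(query_head, group_size=4):
    """Map a query head to the key/value head it shares under a 4:1 ratio."""
    return query_head // group_size


random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
chosen = route_token(logits)
active_fraction = len(chosen) / NUM_EXPERTS

print("experts used:", [i for i, _ in chosen])
print(f"active expert fraction: {active_fraction:.0%}")
print("query heads 0-7 -> kv heads:", [kv_head_for_query_head(h) for h in range(8)])
```

With a 4:1 grouping, every four query heads read the same key/value head, so the KV cache holds a quarter as many heads as a standard multi-head layout would.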
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/docs` | GET | Swagger UI documentation |
| `/generate` | POST | Generate text from prompt |
| `/model-info` | GET | Get model architecture info |
| `/health` | GET | Health check |
| `/gradio` | GET | Interactive Gradio interface |
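The GET endpoints can be probed with a small helper like the one below. This is a sketch using only the standard library; `BASE_URL` is the same placeholder host used in the usage example, `endpoint_url` and `get_json` are hypothetical helper names, and the assumption that `/health` and `/model-info` return JSON is not confirmed by this README.

```python
import json
import urllib.request

BASE_URL = "https://your-space.hf.space"  # placeholder host; replace with your Space


def endpoint_url(path, base_url=BASE_URL):
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(path, base_url=BASE_URL, timeout=10):
    """GET an endpoint and decode the body as JSON.

    Assumes the endpoint responds with a JSON body; the response schema
    is not documented in this README.
    """
    with urllib.request.urlopen(endpoint_url(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `get_json("/health")` or `get_json("/model-info")` would fetch the corresponding endpoint once `BASE_URL` points at a running Space.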
Example Usage
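import requests

# Replace the placeholder host with your own Space URL.
response = requests.post(
    "https://your-space.hf.space/generate",
    json={
        "prompt": "Once upon a time",
        "max_new_tokens": 100,
        "temperature": 0.8,
    },
    timeout=60,
)
response.raise_for_status()  # fail loudly on HTTP errors
print(response.json()["generated_text"])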
Model Variants
| Model | Total Params | Active Params | Target |
|---|---|---|---|
| max2-nano | 500M | 125M | IoT, Mobile |
| max2-lite | 1.5B | 375M | Mobile, Tablet |
| max2-pro | 3B | 750M | Desktop |
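The figures in the table can be checked with a few lines of arithmetic. The sketch below computes the active fraction for each variant and a rough size for the weights, assuming 2 bytes per parameter (fp16 or bf16). That precision is an assumption, not something this README states, and the estimate excludes activations and the KV cache.

```python
# (total parameters, active parameters per token), taken from the table above
VARIANTS = {
    "max2-nano": (500e6, 125e6),
    "max2-lite": (1.5e9, 375e6),
    "max2-pro": (3e9, 750e6),
}

BYTES_PER_PARAM = 2  # assumption: fp16/bf16 weights


def summarize(total_params, active_params):
    """Return (active fraction, approximate weight size in GB)."""
    fraction = active_params / total_params
    weight_gb = total_params * BYTES_PER_PARAM / 1e9
    return fraction, weight_gb


for name, (total, active) in VARIANTS.items():
    fraction, weight_gb = summarize(total, active)
    print(f"{name}: {fraction:.0%} active, ~{weight_gb:.1f} GB of weights")
```

Note that all parameters still have to be stored even though only a quarter are used per token, so the MoE design reduces compute per token rather than the size of the weights.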
License
Apache 2.0