---
title: MiniMind Max2 API
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - text-generation
  - moe
  - fastapi
  - language-model
---

# 🧠 MiniMind Max2 API

**Tiny Model, Powerful Experience** - An efficient language model API with a FastAPI backend.

## Features

- **Mixture of Experts (MoE)**: Only 25% of parameters are activated per token
- **Grouped Query Attention**: 4:1 ratio for memory efficiency
- **FastAPI Backend**: RESTful API with automatic docs
- **Gradio Interface**: Interactive UI for testing

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/docs` | GET | Swagger UI documentation |
| `/generate` | POST | Generate text from a prompt |
| `/model-info` | GET | Get model architecture info |
| `/health` | GET | Health check |
| `/gradio` | GET | Interactive Gradio interface |

## Example Usage

```python
import requests

# Replace the host with the URL of your own Space.
response = requests.post(
    "https://your-space.hf.space/generate",
    json={
        "prompt": "Once upon a time",
        "max_new_tokens": 100,
        "temperature": 0.8
    },
    timeout=60,  # avoid hanging forever if the Space is asleep or busy
)
response.raise_for_status()  # surface HTTP errors before parsing the body
print(response.json()["generated_text"])
```

## Model Variants

| Model | Total Params | Active Params | Target |
|-------|-------------|---------------|--------|
| max2-nano | 500M | 125M | IoT, Mobile |
| max2-lite | 1.5B | 375M | Mobile, Tablet |
| max2-pro | 3B | 750M | Desktop |

## License

Apache 2.0
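
## Checking the Active Parameter Counts

The active parameter figures in the Model Variants table follow from the 25% activation ratio listed under Features. The sketch below reproduces that arithmetic. It assumes the ratio applies uniformly to the total parameter count, which is a simplification: in a typical MoE model, shared components such as embeddings and attention layers are always active, so real counts may differ slightly. The helper names `active_params` and `format_count` are hypothetical and not part of the API.

```python
from fractions import Fraction

# Assumption: exactly 25% of all parameters are active per token,
# as stated in the Features list.
ACTIVE_FRACTION = Fraction(1, 4)

# Total parameter counts taken from the Model Variants table.
TOTAL_PARAMS = {
    "max2-nano": 500_000_000,
    "max2-lite": 1_500_000_000,
    "max2-pro": 3_000_000_000,
}


def active_params(total: int, fraction: Fraction = ACTIVE_FRACTION) -> int:
    """Return the number of parameters active per token."""
    return int(total * fraction)


def format_count(n: int) -> str:
    """Format a parameter count in the table's style, e.g. '125M' or '1.5B'."""
    if n >= 1_000_000_000:
        return f"{n / 1_000_000_000:g}B"
    return f"{n / 1_000_000:g}M"


for name, total in TOTAL_PARAMS.items():
    active = active_params(total)
    print(f"{name}: {format_count(total)} total, {format_count(active)} active")
```

Using `Fraction` keeps the arithmetic exact, so the computed values can be compared against the table directly without floating-point rounding.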