---
title: MiniMind Max2 API
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - text-generation
  - moe
  - fastapi
  - language-model
---

# 🧠 MiniMind Max2 API

**Tiny Model, Powerful Experience** - An efficient language model API with a FastAPI backend and a Gradio interface for interactive testing.

## Features

- **Mixture of Experts (MoE)**: Only 25% of parameters activated per token
- **Grouped Query Attention**: 4:1 ratio of query heads to key/value heads for memory efficiency
- **FastAPI Backend**: RESTful API with automatic docs
- **Gradio Interface**: Interactive UI for testing
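
To make the memory claim for Grouped Query Attention concrete, here is a rough sketch of the key/value cache arithmetic. It assumes the 4:1 ratio means four query heads share one key/value head. The layer count, head count, head dimension, and sequence length are hypothetical values chosen for illustration, not the actual MiniMind Max2 configuration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence across all layers."""
    # The leading factor of 2 covers the separate key and value tensors.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical dimensions, for illustration only (not the real model config).
NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN = 12, 16, 64, 2048
GQA_RATIO = 4  # assumed meaning of "4:1": query heads per key/value head

# Standard multi-head attention keeps one key/value head per query head.
mha = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention keeps one key/value head per group of query heads.
gqa = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS // GQA_RATIO, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha / 2**20:.0f} MiB, GQA cache: {gqa / 2**20:.0f} MiB")
```

Under these assumptions the cache shrinks by the same factor as the head ratio, regardless of the other dimensions.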

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/docs` | GET | Swagger UI documentation |
| `/generate` | POST | Generate text from prompt |
| `/model-info` | GET | Get model architecture info |
| `/health` | GET | Health check |
| `/gradio` | GET | Interactive Gradio interface |
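
The two read-only endpoints can be queried with the standard library alone. The sketch below assumes `/health` and `/model-info` return JSON bodies; the response fields are not documented here, so the helper simply returns whatever is parsed. `BASE_URL`, `endpoint_url`, and `get_json` are illustrative names, not part of the API.

```python
import json
from urllib.request import urlopen

# Placeholder: replace with the URL of your own Space.
BASE_URL = "https://your-space.hf.space"


def endpoint_url(base_url, path):
    """Join a base URL and an endpoint path without doubling the slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(base_url, path, timeout=10.0):
    """GET an endpoint and parse the body as JSON (assumed response format)."""
    with urlopen(endpoint_url(base_url, path), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `get_json(BASE_URL, "/health")` could be used to check that the Space is up before sending a generation request, and `get_json(BASE_URL, "/model-info")` to inspect the architecture.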

## Example Usage

```python
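import requests

# Replace with the URL of your own Space.
API_URL = "https://your-space.hf.space/generate"

response = requests.post(
    API_URL,
    json={
        "prompt": "Once upon a time",
        "max_new_tokens": 100,
        "temperature": 0.8,
    },
    timeout=60,  # avoid hanging indefinitely if the Space is slow to respond
)
response.raise_for_status()  # surface HTTP errors before parsing the body
print(response.json()["generated_text"])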
```

## Model Variants

| Model | Total Params | Active Params | Target |
|-------|-------------|---------------|--------|
| max2-nano | 500M | 125M | IoT, Mobile |
| max2-lite | 1.5B | 375M | Mobile, Tablet |
| max2-pro | 3B | 750M | Desktop |
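
The active parameter counts in the table follow from the 25% activation rate listed under Features. A quick check, using only the totals from the table (the names `TOTAL_PARAMS` and `active_params` are illustrative):

```python
# Total parameter counts taken from the table above.
TOTAL_PARAMS = {
    "max2-nano": 500_000_000,
    "max2-lite": 1_500_000_000,
    "max2-pro": 3_000_000_000,
}
ACTIVE_PERCENT = 25  # share of parameters activated per token


def active_params(total):
    """Parameters used per token when only ACTIVE_PERCENT of them are activated."""
    return total * ACTIVE_PERCENT // 100


for name, total in TOTAL_PARAMS.items():
    print(f"{name}: {active_params(total) / 1e6:.0f}M active of {total / 1e6:.0f}M total")
```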

## License

Apache 2.0