
title: MiniMind Max2 API
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - text-generation
  - moe
  - fastapi
  - language-model

🧠 MiniMind Max2 API

Tiny Model, Powerful Experience - An efficient language model API with a FastAPI backend.

Features

- Mixture of Experts (MoE): Only 25% of parameters are activated per token
- Grouped Query Attention: 4:1 ratio of query heads to key/value heads for memory efficiency
- FastAPI Backend: RESTful API with automatically generated docs
- Gradio Interface: Interactive UI for testing
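
The two architectural features above can be illustrated with a toy sketch. It is not the model's actual implementation: the expert count, head counts, and function names are hypothetical, chosen only so that the ratios match the 25% and 4:1 figures. In a real model the active-parameter share also depends on non-expert layers, so top-k over experts is only an approximation of it.

```python
import math

def top_k_experts(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    peak = max(router_logits[i] for i in ranked)
    exps = [math.exp(router_logits[i] - peak) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]

def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Map a query head to the key/value head it shares under grouped-query attention."""
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size

# Hypothetical sizes (not taken from the model config):
# 8 experts with 2 active per token, 16 query heads sharing 4 key/value heads.
NUM_EXPERTS, ACTIVE_EXPERTS = 8, 2
NUM_QUERY_HEADS, NUM_KV_HEADS = 16, 4

routing = top_k_experts([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 0.2], ACTIVE_EXPERTS)
active_fraction = ACTIVE_EXPERTS / NUM_EXPERTS    # share of experts used per token
kv_cache_ratio = NUM_KV_HEADS / NUM_QUERY_HEADS   # KV cache size relative to full multi-head attention

print("selected experts:", [index for index, _ in routing])
print("active expert fraction:", active_fraction)
print("KV cache relative size:", kv_cache_ratio)
```

With a 4:1 grouping, every four consecutive query heads read from the same key/value head, which is where the memory saving in the key/value cache comes from.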

API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/docs` | GET | Swagger UI documentation |
| `/generate` | POST | Generate text from prompt |
| `/model-info` | GET | Get model architecture info |
| `/health` | GET | Health check |
| `/gradio` | GET | Interactive Gradio interface |
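
As a rough sketch of how the read-only endpoints might be called, the helper below builds endpoint URLs and fetches JSON using only the Python standard library. The host is the same placeholder used elsewhere in this README, the helper names are hypothetical, and the assumption that `/health` and `/model-info` return JSON is not confirmed here, since their response schemas are not documented.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Placeholder host; replace with the URL of your own Space.
BASE_URL = "https://your-space.hf.space"

# GET endpoints assumed to return JSON (an assumption, not documented behavior).
GET_ENDPOINTS = {"health": "/health", "model_info": "/model-info"}

def endpoint_url(name, base_url=BASE_URL):
    """Build the full URL for a named GET endpoint."""
    return urljoin(base_url.rstrip("/") + "/", GET_ENDPOINTS[name].lstrip("/"))

def get_json(name, base_url=BASE_URL, timeout=10):
    """Fetch a GET endpoint and decode the response body as JSON."""
    with urlopen(endpoint_url(name, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (requires a running Space):
#   print(get_json("health"))
#   print(get_json("model_info"))
```

Checking `/health` before sending generation requests is a cheap way to tell whether the Space is up.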

Example Usage

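import requests

# Replace with the URL of your own Space.
API_URL = "https://your-space.hf.space/generate"

response = requests.post(
    API_URL,
    json={
        "prompt": "Once upon a time",
        "max_new_tokens": 100,  # upper bound on the number of generated tokens
        "temperature": 0.8,     # sampling temperature
    },
    timeout=60,  # avoid hanging indefinitely if the Space is unreachable
)
response.raise_for_status()  # surface HTTP errors before parsing the body
print(response.json()["generated_text"])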
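
For scripts that call `/generate` repeatedly, it can help to validate inputs and wrap network errors. The following is a minimal standard-library sketch under stated assumptions: the function names are hypothetical, and the validation bounds are illustrative rather than limits documented by the API. Only the request fields and the `generated_text` response key come from the example above.

```python
import json
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

def build_generate_payload(prompt, max_new_tokens=100, temperature=0.8):
    """Validate inputs client-side and return the JSON body for /generate.

    The bounds checked here are illustrative assumptions, not documented API limits.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    return {
        "prompt": prompt,
        "max_new_tokens": int(max_new_tokens),
        "temperature": float(temperature),
    }

def generate(base_url, prompt, timeout=60, **options):
    """POST a prompt to /generate and return the generated text."""
    body = json.dumps(build_generate_payload(prompt, **options)).encode("utf-8")
    request = Request(
        base_url.rstrip("/") + "/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urlopen(request, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))["generated_text"]
    except HTTPError as err:  # must precede URLError, which is its base class
        raise RuntimeError(f"/generate returned HTTP {err.code}") from err
    except URLError as err:
        raise RuntimeError(f"could not reach {base_url}: {err.reason}") from err

# Example (requires a running Space):
#   print(generate("https://your-space.hf.space", "Once upon a time", max_new_tokens=50))
```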

Model Variants

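| Model | Total Params | Active Params | Target Devices |
|-------|--------------|---------------|----------------|
| max2-nano | 500M | 125M | IoT, Mobile |
| max2-lite | 1.5B | 375M | Mobile, Tablet |
| max2-pro | 3B | 750M | Desktop |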
License

Apache 2.0