metadata
title: MiniMind Max2 API
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - text-generation
  - moe
  - fastapi
  - language-model

🧠 MiniMind Max2 API

Tiny model, powerful experience: an efficient language model API with a FastAPI backend.

Features

  • Mixture of Experts (MoE): only 25% of the parameters are activated per token
  • Grouped Query Attention (GQA): 4:1 ratio of query heads to key/value heads, which reduces KV-cache memory
  • FastAPI Backend: RESTful API with automatically generated docs
  • Gradio Interface: interactive UI for testing
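To make the "25% of parameters per token" claim concrete, here is a minimal sketch of top-k expert routing in plain Python. The expert count (8) and `TOP_K` (2) are assumptions chosen only so the ratio comes out to 25%; the README does not state the real configuration, and the names `route` and `moe_forward` are hypothetical. In a real model, shared layers such as attention are always active, so the exact fraction depends on the architecture.

```python
import math

NUM_EXPERTS = 8   # hypothetical; the README does not state the expert count
TOP_K = 2         # 2 of 8 experts -> 25% of expert parameters per token


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k experts.

    Weights are renormalised so they sum to 1 over the chosen experts.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, router_logits):
    """Combine the outputs of only the selected experts for one token."""
    return sum(w * experts[i](x) for i, w in route(router_logits))


# Toy experts: expert k simply multiplies its input by k.
experts = [lambda x, k=k: k * x for k in range(NUM_EXPERTS)]
logits = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]
print(route(logits))                      # experts 4 and 1 are selected
print(moe_forward(1.0, experts, logits))  # only two experts are evaluated
```

The other six experts are never called for this token, which is where the compute saving comes from.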

API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /docs | GET | Swagger UI documentation |
| /generate | POST | Generate text from a prompt |
| /model-info | GET | Get model architecture info |
| /health | GET | Health check |
| /gradio | GET | Interactive Gradio interface |
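The endpoints listed above can be wrapped in a small client. The sketch below uses only the standard library; `ENDPOINTS`, `build_request`, and `call` are hypothetical helper names, the host is the same placeholder used in the usage example, and the assumption that `/health` and `/model-info` return JSON is not confirmed by this README.

```python
import json
import urllib.request

BASE_URL = "https://your-space.hf.space"  # placeholder host; use your own Space URL

# Endpoint paths and methods as listed in the API Endpoints section.
ENDPOINTS = {
    "docs": ("GET", "/docs"),
    "generate": ("POST", "/generate"),
    "model_info": ("GET", "/model-info"),
    "health": ("GET", "/health"),
    "gradio": ("GET", "/gradio"),
}


def build_request(name, payload=None, base_url=BASE_URL):
    """Build (but do not send) a urllib Request for a named endpoint."""
    method, path = ENDPOINTS[name]
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )


def call(name, payload=None, base_url=BASE_URL, timeout=30):
    """Send the request and decode the body as JSON.

    Assumes a JSON response, so it is not suitable for /docs or /gradio,
    which serve HTML pages.
    """
    request = build_request(name, payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a real Space URL in `BASE_URL`, `call("health")` would be a quick way to confirm the service is up before sending a `call("generate", {...})` request.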

Example Usage

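import requests

# Replace the placeholder host with your own Space URL.
response = requests.post(
    "https://your-space.hf.space/generate",
    json={
        "prompt": "Once upon a time",
        "max_new_tokens": 100,
        "temperature": 0.8,
    },
    timeout=60,  # avoid hanging indefinitely if the Space is unresponsive
)
response.raise_for_status()  # surface HTTP errors before parsing the body
print(response.json()["generated_text"])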

Model Variants

| Model | Total Params | Active Params | Target |
|---|---|---|---|
| max2-nano | 500M | 125M | IoT, Mobile |
| max2-lite | 1.5B | 375M | Mobile, Tablet |
| max2-pro | 3B | 750M | Desktop |
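As a quick sanity check on these figures, the sketch below computes the active fraction for each variant and a rough weight footprint. The 2 bytes per parameter (16-bit weights) is an assumption, since the README does not state a precision, and the estimate ignores the KV cache and activations. The helper names are hypothetical.

```python
# Parameter counts taken from the Model Variants section.
VARIANTS = {
    "max2-nano": {"total": 500_000_000, "active": 125_000_000},
    "max2-lite": {"total": 1_500_000_000, "active": 375_000_000},
    "max2-pro": {"total": 3_000_000_000, "active": 750_000_000},
}

BYTES_PER_PARAM = 2  # assumption: 16-bit weights; not stated in the README


def active_fraction(name):
    """Share of parameters used per token for the given variant."""
    variant = VARIANTS[name]
    return variant["active"] / variant["total"]


def weight_gigabytes(name, bytes_per_param=BYTES_PER_PARAM):
    """Approximate size of all weights in GB (10**9 bytes).

    Uses the total count: an MoE model typically keeps every expert
    loaded even though only some are active for a given token.
    """
    return VARIANTS[name]["total"] * bytes_per_param / 1e9


for name in VARIANTS:
    print(f"{name}: {active_fraction(name):.0%} active, "
          f"~{weight_gigabytes(name):.1f} GB of weights")
# max2-nano: 25% active, ~1.0 GB of weights
# max2-lite: 25% active, ~3.0 GB of weights
# max2-pro: 25% active, ~6.0 GB of weights
```

All three variants come out at the same 25% active ratio quoted in the feature list, so the smaller active count lowers compute per token rather than the memory needed to hold the model.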

License

Apache 2.0