Spaces: fariasultana/MiniMind-API

	---
	title: MiniMind Max2 API
	emoji: 🧠
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 4.44.0
	app_file: app.py
	pinned: false
	license: apache-2.0
	tags:
	- text-generation
	- moe
	- fastapi
	- language-model
	---

	# 🧠 MiniMind Max2 API

	Tiny Model, Powerful Experience: an efficient language model API with a FastAPI backend.

	## Features

	- Mixture of Experts (MoE): only 25% of parameters are activated per token
	- Grouped Query Attention: 4:1 ratio of query heads to key/value heads, which reduces key/value cache memory
	- FastAPI Backend: RESTful API with automatically generated docs at `/docs`
	- Gradio Interface: interactive UI for testing at `/gradio`
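
To make the memory claim for Grouped Query Attention concrete, here is a minimal sketch of the key/value cache arithmetic. It assumes the 4:1 ratio means four query heads share one key/value head. The layer count, head count, head dimension, and sequence length are hypothetical values chosen for illustration, not the actual MiniMind Max2 configuration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Approximate KV-cache size for one sequence: keys plus values in every layer."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical configuration, used only to illustrate the 4:1 ratio.
NUM_LAYERS = 12
NUM_QUERY_HEADS = 16
HEAD_DIM = 64
SEQ_LEN = 2048
GQA_RATIO = 4  # assumed meaning: query heads per key/value head

# Standard multi-head attention keeps one key/value head per query head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped query attention keeps one key/value head per group of query heads.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS // GQA_RATIO, HEAD_DIM, SEQ_LEN)

print(f"Multi-head cache:    {mha_bytes / 2**20:.1f} MiB")
print(f"Grouped-query cache: {gqa_bytes / 2**20:.1f} MiB")
print(f"Reduction factor:    {mha_bytes / gqa_bytes:.0f}x")
```

Under these assumptions the cache shrinks by the same factor as the head ratio, regardless of the other dimensions.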

	## API Endpoints

	| Endpoint | Method | Description |
	|----------|--------|-------------|
	| `/docs` | GET | Swagger UI documentation |
	| `/generate` | POST | Generate text from prompt |
	| `/model-info` | GET | Get model architecture info |
	| `/health` | GET | Health check |
	| `/gradio` | GET | Interactive Gradio interface |
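
The GET endpoints in the table can be probed with the standard library alone. The sketch below assumes that `/health` and `/model-info` return JSON; their response fields are not documented here, so the code prints whatever comes back. `BASE_URL` is a placeholder, and `endpoint_url`, `get_json`, and `show_status` are hypothetical helper names, not part of the API.

```python
import json
import urllib.request

# Placeholder base URL; replace with the URL of your own Space.
BASE_URL = "https://your-space.hf.space"


def endpoint_url(base_url, path):
    """Join a base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(base_url, path, timeout=10):
    """GET an endpoint and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(endpoint_url(base_url, path), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def show_status(base_url=BASE_URL):
    """Print the raw responses of the health and model-info endpoints."""
    for path in ("/health", "/model-info"):
        print(path, get_json(base_url, path))
```

Calling `show_status()` against a running Space prints both responses. A failed request raises `urllib.error.URLError` (or its subclass `HTTPError`), which is a reasonable signal that the Space is not up.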

	## Example Usage

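	```python
	import requests

	# Replace with the URL of your own Space.
	API_URL = "https://your-space.hf.space/generate"

	response = requests.post(
	    API_URL,
	    json={
	        "prompt": "Once upon a time",
	        "max_new_tokens": 100,
	        "temperature": 0.8,
	    },
	    timeout=60,  # avoid hanging forever if the Space is asleep or down
	)
	response.raise_for_status()  # raise on HTTP errors instead of failing on a missing key
	print(response.json()["generated_text"])
	```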

	## Model Variants

	| Model | Total Params | Active Params | Target |
	|-------|--------------|---------------|--------|
	| max2-nano | 500M | 125M | IoT, Mobile |
	| max2-lite | 1.5B | 375M | Mobile, Tablet |
	| max2-pro | 3B | 750M | Desktop |
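
As a quick consistency check, the active-parameter share of each variant can be computed from the table above. This small sketch uses only the table's numbers (in millions of parameters); the `VARIANTS` dictionary and `active_fraction` helper are illustrative names.

```python
# (total, active) parameter counts in millions, copied from the table above.
VARIANTS = {
    "max2-nano": (500, 125),
    "max2-lite": (1500, 375),
    "max2-pro": (3000, 750),
}


def active_fraction(total_params, active_params):
    """Share of parameters used per token."""
    return active_params / total_params


for name, (total, active) in VARIANTS.items():
    print(f"{name}: {active_fraction(total, active):.0%} of parameters active per token")
```

Every variant comes out at 25%, matching the Mixture of Experts figure listed under Features.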

	## License

	Apache 2.0