Instructions to use TUM-EDA/Flui3d-Chat-Qwen3-Base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use TUM-EDA/Flui3d-Chat-Qwen3-Base with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="TUM-EDA/Flui3d-Chat-Qwen3-Base",
    filename="unsloth_qwen3_baseline_bf16-00001-of-00014.gguf",
)

output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use TUM-EDA/Flui3d-Chat-Qwen3-Base with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
# Run inference directly in the terminal:
llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
# Run inference directly in the terminal:
llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
# Run inference directly in the terminal:
./llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
Use Docker
docker model run hf.co/TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
- LM Studio
- Jan
- Ollama
How to use TUM-EDA/Flui3d-Chat-Qwen3-Base with Ollama:
ollama run hf.co/TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
- Unsloth Studio
How to use TUM-EDA/Flui3d-Chat-Qwen3-Base with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for TUM-EDA/Flui3d-Chat-Qwen3-Base to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for TUM-EDA/Flui3d-Chat-Qwen3-Base to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for TUM-EDA/Flui3d-Chat-Qwen3-Base to start chatting
- Docker Model Runner
How to use TUM-EDA/Flui3d-Chat-Qwen3-Base with Docker Model Runner:
docker model run hf.co/TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
- Lemonade
How to use TUM-EDA/Flui3d-Chat-Qwen3-Base with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16
Run and chat with the model
lemonade run user.Flui3d-Chat-Qwen3-Base-BF16
List all available models
lemonade list
Flui3d Chat Model Qwen 3 Base
Model Description
This model is a fine-tuned version of Qwen 3 designed for microfluidic chip design generation. It translates high-level design requirements into structured microfluidic system descriptions.
The model generates outputs in a structured JSON format following a predefined schema (see: Output Format). The generated JSON describes a complete microfluidic chip, including:
- microfluidic components
- component parameters
- channel connections
- structural relationships between elements
This allows the model to act as a design file generator for microfluidic systems, enabling automated or AI-assisted microfluidic chip design workflows.
The repository includes:
- LoRA adapter weights
- Quantized, split GGUF model files compatible with Ollama (the parts must be merged before use)
GGUF files can be merged using tools provided by llama.cpp (see: Merging Split GGUF Files).
Intended Use
This model is intended for:
- Automated microfluidic chip design generation
- AI-assisted CAD workflows for microfluidics
- Research in AI-assisted scientific design
- Programmatic generation of microfluidic device specifications
The model converts natural language design requirements into structured microfluidic design specifications.
Example Applications
- Rapid prototyping of microfluidic devices
- Automated generation of chip layouts
- Integration with microfluidic CAD pipelines
- AI-driven design exploration
Model Architecture
- Base Model: Qwen 3 32B
- Fine-tuning Method: SFT LoRA
- Reasoning Strategy: None
- Output Format: Structured JSON
The model is trained to produce schema-compliant structured outputs representing microfluidic chip configurations.
Output Format
The model generates JSON objects conforming to a predefined schema.
Schema definition:
https://github.com/TUM-EDA/Flui3d-Chat/blob/master/Dataset%20and%20Training%20Framework/datasets/resources/json_schemas/microfluidic_schema.json
The JSON output typically includes:
- Component definitions
- Channel connections
- Parameterized microfluidic elements
- Junction definitions
Example Output
{
  "connections": [
    {
      "source": "inlet_1",
      "target": "mixer_1"
    },
    {
      "source": "inlet_2",
      "target": "mixer_1"
    },
    {
      "source": "mixer_1",
      "target": "outlet_1"
    }
  ],
  "junctions": [
    {
      "id": "junction_1",
      "type": "T-junction",
      "source_1": "inlet_1",
      "source_2": "inlet_2",
      "target": "mixer_1"
    }
  ],
  "component_params": {
    "mixers": [
      {
        "id": "mixer_1",
        "num_turnings": 4
      }
    ],
    "delays": [],
    "chambers": [],
    "filters": []
  }
}
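The example output can be sanity-checked with a few lines of Python before it is passed to downstream tooling. The sketch below is not part of the Flui3d-Chat repository and does not implement the published schema. The function names (`connection_nodes`, `find_problems`) are hypothetical, and the two rules it checks (every parameterized component and every junction endpoint should appear in at least one connection) are assumptions inferred from the example above, not rules stated by the authors.

```python
# Assumed consistency rules, inferred from the example output only.
EXAMPLE_DESIGN = {
    "connections": [
        {"source": "inlet_1", "target": "mixer_1"},
        {"source": "inlet_2", "target": "mixer_1"},
        {"source": "mixer_1", "target": "outlet_1"},
    ],
    "junctions": [
        {
            "id": "junction_1",
            "type": "T-junction",
            "source_1": "inlet_1",
            "source_2": "inlet_2",
            "target": "mixer_1",
        }
    ],
    "component_params": {
        "mixers": [{"id": "mixer_1", "num_turnings": 4}],
        "delays": [],
        "chambers": [],
        "filters": [],
    },
}


def connection_nodes(design):
    """Collect every name used as a source or target of a connection."""
    nodes = set()
    for connection in design.get("connections", []):
        nodes.add(connection["source"])
        nodes.add(connection["target"])
    return nodes


def find_problems(design):
    """Return a list of human-readable inconsistencies (empty if none found)."""
    problems = []
    nodes = connection_nodes(design)

    # Assumed rule 1: each parameterized component is wired into the chip.
    for group, components in design.get("component_params", {}).items():
        for component in components:
            if component["id"] not in nodes:
                problems.append(
                    f"{group} component {component['id']!r} does not appear in any connection"
                )

    # Assumed rule 2: each junction endpoint is a node of some connection.
    for junction in design.get("junctions", []):
        for key, value in junction.items():
            if (key.startswith("source") or key == "target") and value not in nodes:
                problems.append(
                    f"junction {junction['id']!r} references unknown node {value!r}"
                )
    return problems


print(find_problems(EXAMPLE_DESIGN))  # prints []
```

For authoritative validation, the schema file linked under Output Format remains the reference; a check like this only catches obvious wiring inconsistencies.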
Repository Contents
This repository includes:
1. LoRA Adapter
The LoRA adapter can be loaded on top of the base Qwen model for inference or further fine-tuning.
2. Quantized GGUF Models
Quantized GGUF format models compatible with:
- Ollama
- llama.cpp
Due to file size limitations, the GGUF models are split into multiple parts. These files must be merged before use.
Merging Split GGUF Files
To merge the split GGUF files, use the merging utilities from llama.cpp:
https://github.com/ggml-org/llama.cpp/blob/master/tools/gguf-split/README.md
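As a rough illustration of the merge step, the following Python sketch lists the shard files expected next to the first shard and builds a merge command. It assumes the `llama-gguf-split` binary from llama.cpp is on the `PATH` and accepts `--merge <first shard> <output file>`, as described in the linked README. The helper names (`expected_shards`, `merge_command`, `merge`) and the output filename are hypothetical; the shard name is the one shown in the llama-cpp-python snippet on this page.

```python
import re
import subprocess
from pathlib import Path

# Split GGUF shards follow the pattern <stem>-00001-of-00014.gguf.
SHARD_PATTERN = re.compile(r"^(?P<stem>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def expected_shards(first_shard):
    """Return the file names of all shards implied by the first shard's name."""
    match = SHARD_PATTERN.match(Path(first_shard).name)
    if match is None:
        raise ValueError(f"not a split GGUF file name: {first_shard}")
    total = int(match["total"])
    return [f"{match['stem']}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def merge_command(first_shard, output, tool="llama-gguf-split"):
    """Build the merge command (assumes the llama.cpp tool name and flag)."""
    return [tool, "--merge", str(first_shard), str(output)]


def merge(first_shard, output):
    """Check that all shards are present, then run the merge tool."""
    directory = Path(first_shard).parent
    missing = [name for name in expected_shards(first_shard) if not (directory / name).exists()]
    if missing:
        raise FileNotFoundError(f"missing shards: {missing}")
    subprocess.run(merge_command(first_shard, output), check=True)


if __name__ == "__main__":
    first = "unsloth_qwen3_baseline_bf16-00001-of-00014.gguf"
    print(len(expected_shards(first)), "shards expected")
    print(" ".join(merge_command(first, "flui3d-chat-qwen3-base-bf16.gguf")))
```

Checking for missing shards first avoids starting a long merge that would fail partway through.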
Usage with Ollama
The merged GGUF file can be used with Ollama.
Example prompt:
Design a microfluidic chip with two inlets, one mixer, and a single outlet.
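A prompt like this could be sent to a locally running Ollama server from Python. The sketch below uses only the standard library and Ollama's `/api/generate` endpoint on its default port. The model tag is the one from the `ollama run` command on this page and is an assumption: it needs to be replaced with whatever name the merged model is registered under locally. The function names are hypothetical, and whether the model needs a particular system prompt is not stated in this card.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_NAME = "hf.co/TUM-EDA/Flui3d-Chat-Qwen3-Base:BF16"  # assumed local model tag


def build_request(prompt, model=MODEL_NAME, url=OLLAMA_URL):
    """Build a non-streaming generate request that asks Ollama for JSON output."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,   # return one complete response instead of chunks
        "format": "json",  # ask Ollama to constrain the output to JSON
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_design(prompt):
    """Send the prompt and parse the model's reply as a design dictionary."""
    with urllib.request.urlopen(build_request(prompt)) as response:
        body = json.loads(response.read())
    # In non-streaming mode the generated text is in the "response" field.
    return json.loads(body["response"])


if __name__ == "__main__":
    design = generate_design(
        "Design a microfluidic chip with two inlets, one mixer, and a single outlet."
    )
    print(json.dumps(design, indent=2))
```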
Limitations
- The model is trained to emit schema-based output, but it may still produce invalid JSON when prompts are poorly structured.
- Generated designs should be validated before fabrication.
- The model does not replace domain expert verification.
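Because the output may be invalid JSON, callers will likely want a defensive parsing step before any further validation. The sketch below is one possible approach, not the authors' pipeline: `parse_design` is a hypothetical helper, and the three required top-level keys are an assumption taken from the example output rather than from the schema file.

```python
import json

# Assumed from the example output; the schema file is the authoritative source.
ASSUMED_TOP_LEVEL_KEYS = ("connections", "junctions", "component_params")


def parse_design(raw_text):
    """Return (design, error). Exactly one of the two is None."""
    # Tolerate extra text around the JSON object by slicing from the
    # first "{" to the last "}".
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end < start:
        return None, "no JSON object found"
    try:
        design = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg} at position {exc.pos}"
    missing = [key for key in ASSUMED_TOP_LEVEL_KEYS if key not in design]
    if missing:
        return None, f"missing top-level keys: {missing}"
    return design, None


if __name__ == "__main__":
    text = 'Here is the design: {"connections": [], "junctions": [], "component_params": {}}'
    design, error = parse_design(text)
    print(design if error is None else error)
```

A failed parse can then trigger a retry with a clearer prompt, while a successful parse still needs schema validation and expert review before fabrication.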
Citation
If you use this model in academic work, please cite:
To be published.