Today we are releasing Supra Title (experimental), and it is unlike anything we have shipped before. It is not a general-purpose language model. It does not answer questions or write code. It does exactly one thing: generate short, accurate titles for chat conversations.

The problem it solves

Every AI chat platform needs to title conversations automatically. Claude does it, ChatGPT does it, every local app does it. The typical approach is to send the first message to the same large model that handles everything else and ask it to produce a title. It works fine. But it is wasteful: you are spinning up a 7B model, burning context, and waiting for a response just to get three words back.

Supra Title is purpose-built for this exact task. 350M parameters, GGUF format, runs on any hardware, and produces a title in milliseconds. No system prompt needed, just send the user message directly.

How it works

The integration pattern is dead simple. When a user sends a message, you fire two requests in parallel: one to your main model for the actual response, and one to Supra Title for the conversation title. By the time the main model finishes, the title is already there.

User sends message
Supra Title
title in ms
&
Main LLM
full response

No system prompt. No special formatting. Just the raw user message in the user turn and the model returns a short title. That is the entire interface.

Example outputs

The model handles everything from one-liners to long multi-topic messages:

// user message
bruh my wifi keeps disconnecting every 10 minutes 😭
WiFi Issues
// user message
what's the easiest way to make fluffy pancakes?
Fluffy Pancakes
// user message
i accidentally sent a message to the wrong person and now i'm dying inside
Wrong Message
// user message
can someone explain taxes to me like i'm five because none of this makes sense
Understanding Taxes
// user message
I've been trying to learn guitar for a few months now, but I still struggle with switching between chords smoothly...
Guitar Practice Tips
// user message
My friend and I are arguing about whether humanity would have progressed faster if the internet had been invented fifty years earlier...
Earlier Internet Debate

Model details

Supra Title is a fine-tune of LiquidAI LFM2.5-350M-Base, a liquid foundation model architecture. The GGUF quantizations available cover the full range from Q2 to BF16:

QuantizationFile sizeUse case
Q2_K_L177 MBabsolute minimum VRAM
Q3_K_M193 MBlow memory devices
Q4_K_M229 MBrecommended default
Q5_K_M260 MBhigher quality
Q6_K293 MBnear lossless
Q8_0379 MBmaximum quality, CPU friendly
BF16 / F16711 MBfull precision

Quick start

With Ollama (easiest):

// ollama ollama run hf.co/SupraLabs/Supra-Title-350M-exp-GGUF:Q4_K_M

With llama.cpp directly:

// llama.cpp llama-server -hf SupraLabs/Supra-Title-350M-exp-GGUF:Q4_K_M

What's next

This is an experimental release. We are actively expanding the SFT dataset toward 115,000 high-quality samples and exploring preference optimization to push title accuracy further. A full non-experimental release is in development. The current weights are out so the community can test real-world performance and give us feedback before we finalize anything.

It is small, it is fast, and it does one thing well. Go try it.

// links Model   → huggingface.co/SupraLabs/Supra-Title-350M-exp-GGUF
License → Apache-2.0
Base    → LiquidAI/LFM2.5-350M-Base
#release #supra-title #gguf #title-generation #edge-ai #tinyml #open-source #lfm2