Sensix Paite 4B Message-Format (16-bit)
This model is a fine-tuned version of Gemma 3 4B, trained with a two-stage pipeline: continued pre-training followed by supervised fine-tuning. Unlike standard instruction-tuned models, it uses Message-Format Fine-Tuning, which yields more natural, human-like dialogue and smoother conversational flow.
Why Message-Format?
Traditional instruction tuning often leads to robotic or overly formal responses. Training on a message-based conversational format during the SFT stage gives this model a more natural speaking style, so an exchange feels like a real interaction rather than a simple command-response system. It is specifically designed to handle the nuances of modern Paite speech.
Model Details
- Base Model: unsloth/gemma-3-4b-it
- Training Strategy: Conversational Message-Format (User/Model roles)
- Precision: 16-bit bfloat16 (unquantized)
- Architecture: Gemma 3, 4 billion parameters
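As a rough guide to hardware requirements, the memory needed for the weights can be estimated from the figures above. This sketch assumes exactly 4 billion parameters (the card's rounded figure) and 2 bytes per bfloat16 value. It ignores activations and the KV cache, so real usage will be higher.

```python
# Rough weight-memory estimate; the parameter count is the card's rounded figure.
PARAMS = 4_000_000_000      # assumption: "4 billion" taken literally
BYTES_PER_BF16 = 2          # bfloat16 stores each value in 16 bits

def weight_memory_gib(n_params: int, bytes_per_param: int) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

print(f"{weight_memory_gib(PARAMS, BYTES_PER_BF16):.2f} GiB")  # 7.45 GiB
```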
Training Procedure
Stage 1: Continued Pre-Training (CPT)
Raw domain knowledge was injected using the PERFECT_PAITE_DATA.jsonl dataset to expand the model's vocabulary and internalize Paite grammatical structures.
- Learning Rate: 2e-4
- LoRA Config: r=128, alpha=256
- Focus: Vocabulary acquisition and factual foundation.
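To make the LoRA numbers concrete: standard LoRA (without the rank-stabilized variant) adds two low-rank matrices to each adapted weight and scales their product by alpha / r. The sketch below computes that scale and the number of added parameters for a single layer. The 2560 x 2560 layer shape and the `LoraSetting` helper are illustrative assumptions, not details taken from this card.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LoraSetting:
    r: int        # rank of the low-rank update
    alpha: int    # numerator of the LoRA scaling factor

    @property
    def scale(self) -> float:
        # Standard LoRA scales the update B @ A by alpha / r.
        return self.alpha / self.r

    def added_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        return self.r * (d_in + d_out)

cpt = LoraSetting(r=128, alpha=256)   # Stage 1 values from this card
D_IN = D_OUT = 2560                   # assumption: illustrative layer shape

print(cpt.scale)                      # 2.0
print(cpt.added_params(D_IN, D_OUT))  # 655360
print(D_IN * D_OUT)                   # 6553600 weights in the full matrix
```

Under these assumptions the adapter trains about a tenth as many values as the full matrix holds for that layer.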
Stage 2: Supervised Fine-Tuning (SFT)
Advanced fine-tuning using message-formatted conversational data to refine personality and dialogue flow.
- Learning Rate: 2e-5 (Gentle alignment)
- LoRA Config: r=64, alpha=128
- Packing: Enabled for efficient training.
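Packing concatenates several short tokenized examples into fixed-length blocks so that little of each training sequence is spent on padding. The following is a minimal sketch of the idea using made-up token IDs. `EOS_ID`, `pack`, and the block size are hypothetical names and values, and the training library actually used may pack differently (for example, by also masking attention across example boundaries).

```python
from typing import Iterable

EOS_ID = 1  # hypothetical end-of-sequence token ID

def pack(examples: Iterable[list[int]], block_size: int) -> list[list[int]]:
    """Concatenate examples (each followed by EOS) and cut into full blocks."""
    stream: list[int] = []
    for ids in examples:
        stream.extend(ids)
        stream.append(EOS_ID)
    # Drop the trailing remainder that does not fill a whole block.
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

blocks = pack([[5, 6, 7], [8, 9], [10, 11, 12, 13]], block_size=4)
print(blocks)  # [[5, 6, 7, 1], [8, 9, 1, 10], [11, 12, 13, 1]]
```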
Technical Implementation: The Hard Merge Fix
For stable inference, this model was finalized with a hard merge using PEFT's model.merge_and_unload(). This folds the trained LoRA weights into the base model's weights, so no separate adapter is needed at load time. The method was chosen to bypass the "Calcium/Blades" gibberish bug associated with native Gemma 3 saving methods, so that the model remains logical and coherent even during long-form generation.
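Conceptually, merging a LoRA adapter replaces each adapted weight W with W + (alpha / r) * B @ A, after which the adapter path is no longer needed. The toy example below checks this on small hand-written matrices in plain Python. The matrices and helper names are illustrative only, and this is not the PEFT implementation.

```python
Matrix = list[list[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_scaled(w: Matrix, delta: Matrix, scale: float) -> Matrix:
    """Return w + scale * delta, element by element."""
    return [[wi + scale * di for wi, di in zip(wr, dr)] for wr, dr in zip(w, delta)]

# Toy 2x2 base weight with a rank-1 adapter (r=1, alpha=2, so scale = 2.0).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]          # shape (d_out, r)
A = [[0.5, -1.0]]           # shape (r, d_in)
scale = 2 / 1

# Hard merge: fold the scaled low-rank update into the base weight.
merged = add_scaled(W, matmul(B, A), scale)

x = [[3.0], [4.0]]          # column-vector input
# Unmerged adapter path: W x + scale * B (A x)
adapter_out = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), scale)
merged_out = matmul(merged, x)

print(merged)                      # [[2.0, -2.0], [2.0, -3.0]]
print(merged_out == adapter_out)   # True
```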
Messaging Template
The model uses the standard Gemma turn roles for multi-turn conversations:
<start_of_turn>user
{message}<end_of_turn>
<start_of_turn>model
{response}<end_of_turn>
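As a rough illustration of this layout, a conversation can be rendered into the turn format by hand. The sketch assumes the common convention that an "assistant" role is written as a `model` turn, and `render_turns` is a hypothetical helper. In practice `tokenizer.apply_chat_template` should be preferred, since the tokenizer's own template is authoritative and may add special tokens that this sketch omits.

```python
def render_turns(messages: list[dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages in the <start_of_turn>/<end_of_turn> layout shown above."""
    parts = []
    for msg in messages:
        # Assumption: "assistant" messages are written as "model" turns.
        role = "model" if msg["role"] == "assistant" else msg["role"]
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    if add_generation_prompt:
        # Leave an open model turn for the model to complete.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = render_turns([{"role": "user", "content": "Na dam maw?"}])
print(prompt)
```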
Usage Example
# Requires: pip install transformers accelerate torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sensix-zo/sensix-paite-4b-multiturn-16bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Na dam maw? Mimal kithutuahna thupitna hon hilhchian in."}
]
# return_dict=True yields input_ids and attention_mask, so **inputs can be unpacked below.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
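For multi-turn use, the usual pattern is to append each reply to the message list and re-apply the chat template on the next turn. The sketch below shows only that bookkeeping. `chat_turn` is a hypothetical helper, and `generate_fn` is a hypothetical stand-in for the tokenizer and `model.generate` steps above, stubbed here so the example runs without loading the model.

```python
from typing import Callable

Message = dict[str, str]

def chat_turn(history: list[Message], user_text: str,
              generate_fn: Callable[[list[Message]], str]) -> list[Message]:
    """Return a new history with the user message and the model's reply appended."""
    history = history + [{"role": "user", "content": user_text}]
    reply = generate_fn(history)
    # Hugging Face chat templates conventionally use the "assistant" role for replies.
    return history + [{"role": "assistant", "content": reply}]

def echo_stub(messages: list[Message]) -> str:
    # Hypothetical placeholder; a real version would tokenize, generate, and decode.
    return f"(reply to: {messages[-1]['content']})"

history: list[Message] = []
history = chat_turn(history, "Na dam maw?", echo_stub)
history = chat_turn(history, "Tell me more.", echo_stub)
print(len(history), [m["role"] for m in history])
```

Passing the full history on every call is what lets the model see earlier turns; the context length limits how many turns can be kept.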
Limitations and Bias
This model is optimized for natural conversation in Paite. While it maintains English reasoning capabilities, it is recommended primarily for regional linguistic tasks. Users should verify technical or safety-critical information.