Sensix Paite 4B Message-Format (16-bit)

This model is a specialized version of Gemma 3 4B for the Paite language, trained in a two-stage pipeline: continued pre-training followed by supervised fine-tuning. Unlike standard instruction-tuned models, it uses Message-Format Fine-Tuning, which is intended to produce more natural, human-like dialogue and a more fluid conversational flow.

Why Message-Format?

Traditional "Instruction" tuning often leads to robotic or overly formal responses. Training on message-formatted conversations during the SFT stage is meant to make the model's replies read like a real interaction rather than a simple command-response system. The model is designed specifically to handle the nuances of modern Paite speech.

Model Details

  • Base Model: unsloth/gemma-3-4b-it
  • Training Strategy: Conversational Message-Format (User/Model roles)
  • Precision: 16-bit bfloat16 (unquantized)
  • Size: 4 Billion Parameters

Training Procedure

Stage 1: Continued Pre-Training (CPT)

Raw domain knowledge was injected using the PERFECT_PAITE_DATA.jsonl dataset to expand the model's vocabulary and internalize Paite grammatical structures.

  • Learning Rate: 2e-4
  • LoRA Config: r=128, alpha=256
  • Focus: Vocabulary acquisition and factual foundation.

Stage 2: Supervised Fine-Tuning (SFT)

Advanced fine-tuning using message-formatted conversational data to refine personality and dialogue flow.

  • Learning Rate: 2e-5 (Gentle alignment)
  • LoRA Config: r=64, alpha=128
  • Packing: Enabled for efficient training.
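
As a quick reference, the hyperparameters of the two stages can be set side by side. The sketch below is illustrative only: the StageConfig class and its field names are hypothetical and not taken from the training code, and the scaling factor assumes the standard LoRA convention of alpha / r (rank-stabilized LoRA uses a different formula).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """Hypothetical container for the per-stage values listed above."""
    name: str
    learning_rate: float
    lora_r: int
    lora_alpha: int

    @property
    def lora_scaling(self) -> float:
        # Assumption: standard LoRA scales the low-rank update by alpha / r.
        return self.lora_alpha / self.lora_r


# Values copied from the Stage 1 and Stage 2 bullet lists.
CPT = StageConfig("cpt", learning_rate=2e-4, lora_r=128, lora_alpha=256)
SFT = StageConfig("sft", learning_rate=2e-5, lora_r=64, lora_alpha=128)

for stage in (CPT, SFT):
    print(stage.name, stage.learning_rate, stage.lora_scaling)
```

Under that assumption both stages apply the same scaling factor of 2, while the SFT stage halves the rank and lowers the learning rate by a factor of ten.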

Technical Implementation: The Hard Merge Fix

To keep generation stable, this model was finalized with a hard merge using PEFT's model.merge_and_unload(). This call folds the trained LoRA weights into the base model's weights, so the published checkpoint is a single standalone model with no separate adapter. This method was chosen to bypass the "Calcium/Blades" gibberish bug common in native Gemma 3 saving methods, so that the model remains logical and coherent even during long-form generation.
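
What merging means numerically can be shown on a toy layer. The sketch below uses plain Python lists rather than the real model or the PEFT library, and all matrices and helper names are made up for illustration. It assumes the standard LoRA update W + (alpha / r) * B * A.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * B @ A, i.e. the adapter folded into W."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy 2x2 base weight with a rank-1 adapter (values are arbitrary).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, -0.5]]      # shape (r, in_features) = (1, 2)
B = [[1.0], [2.0]]     # shape (out_features, r) = (2, 1)
x = [[3.0], [1.0]]     # column-vector input

merged = merge_lora(W, A, B, alpha=2, r=1)

# Unmerged forward pass: base output plus the scaled adapter output.
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
unmerged = [[b[0] + 2.0 * a[0]] for b, a in zip(base_out, adapter_out)]

# The merged weight gives the same result as base plus adapter.
print(matmul(merged, x) == unmerged)
```

After the merge, inference needs only the single merged matrix, which is why a merged checkpoint can be loaded without any adapter files.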


Messaging Template

The model uses the standard Gemma user and model roles for seamless multi-turn conversations:

<start_of_turn>user
{message}<end_of_turn>
<start_of_turn>model
{response}<end_of_turn>
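
To make the layout concrete, here is a minimal sketch that renders a list of messages into the format above. The function name render_turns is hypothetical, and the sketch deliberately ignores details such as a leading beginning-of-sequence token. In practice the tokenizer's own chat template (tokenizer.apply_chat_template) should be treated as authoritative.

```python
def render_turns(messages, add_generation_prompt=True):
    """Render chat messages in the <start_of_turn>/<end_of_turn> layout."""
    parts = []
    for message in messages:
        # Assumption: "assistant" messages are written under the "model" role.
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{message['content']}<end_of_turn>\n")
    if add_generation_prompt:
        # Leave an open model turn for the model to complete.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = render_turns([{"role": "user", "content": "Na dam maw?"}])
print(prompt)
```

For a multi-turn conversation, earlier model replies are passed back in the same list so that each turn is wrapped in its own start and end markers.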

Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "sensix-zo/sensix-paite-4b-messages-16bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "user", "content": "Na dam maw? Mimal kithutuahna thupitna hon hilhchian in."}
]

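# return_dict=True yields input_ids and attention_mask, which generate(**inputs) expects
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))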

Limitations and Bias

This model is optimized for natural conversation in Paite. While it maintains English reasoning capabilities, it is recommended primarily for regional linguistic tasks. Users should verify technical or safety-critical information.
