AuditFlow — LLaMA 3.1 8B DPO
Model fine-tuned with DPO (Direct Preference Optimization) on a corpus of 3723 preference pairs generated from a Qdrant RAG base (ISA/IFRS/SYSCOHADA/NEP/legal fragments) and from the RLHF feedback collected by the AuditFlow application.
DIC3 SSI thesis, ESP UCAD Dakar, Khadidiatou GUEYE (2025-2026)
Architecture
- Base: meta-llama/Llama-3.1-8B-Instruct
- QLoRA 4-bit NF4 + double quantization
- LoRA r=16, alpha=32, dropout=0.05
- Target layers: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
- β DPO = 0.1 | LR = 5e-05 | Epochs = 2
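To give a sense of how small the trained adapter is, the sketch below counts the LoRA weights implied by r=16 on the seven projections listed above. The layer dimensions are an assumption taken from the public Llama-3.1-8B configuration, not from this card, and the helper names are hypothetical.

```python
# Llama-3.1-8B dimensions (assumption: from the public model config, not this card).
HIDDEN = 4096
INTERMEDIATE = 14336
N_LAYERS = 32
KV_DIM = 1024   # 8 key/value heads x head_dim 128 (grouped-query attention)
VOCAB = 128256

R, ALPHA = 16, 32  # LoRA rank and alpha from the list above

# (in_features, out_features) of each adapted projection
SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(shapes, r, n_layers):
    """Each adapter adds A (r x in) and B (out x r): r * (in + out) weights."""
    return n_layers * sum(r * (i + o) for i, o in shapes.values())


def base_params(shapes, n_layers):
    """Base weights: projections, two RMSNorm vectors per layer,
    input embeddings, untied output head, final norm."""
    per_layer = sum(i * o for i, o in shapes.values()) + 2 * HIDDEN
    return n_layers * per_layer + 2 * VOCAB * HIDDEN + HIDDEN


trainable = lora_params(SHAPES, R, N_LAYERS)
total = base_params(SHAPES, N_LAYERS)

print(f"LoRA scaling alpha/r: {ALPHA / R}")            # 2.0
print(f"adapter weights: {trainable:,}")               # 41,943,040
print(f"share of base weights: {trainable / total:.2%}")
```

Under these assumptions the adapter holds roughly half a percent of the base model's weights, which is why only the adapter needs to be distributed alongside the base checkpoint.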
Usage
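from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in bfloat16, then attach the DPO-trained LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(
    'meta-llama/Llama-3.1-8B-Instruct', torch_dtype=torch.bfloat16, device_map='auto'
)
model = PeftModel.from_pretrained(base, 'khadimeli/auditflow-ohada-llama3-dpo')
tok = AutoTokenizer.from_pretrained('khadimeli/auditflow-ohada-llama3-dpo')

# Prompts are kept in French, as in the original example.
# System: "You are an expert in OHADA/Senegal auditing."
# User: "Benford's law is violated (chi²=7698). ISA 240 ??"
messages = [
    {"role": "system", "content": "Tu es un expert en audit OHADA/Sénégal."},
    {"role": "user", "content": "La loi de Benford est violée (chi²=7698). ISA 240 ??"},
]

# Build the prompt tensor and move it to the model's device before generating.
ids = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors='pt'
).to(model.device)
out = model.generate(ids, max_new_tokens=300)

# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][ids.shape[-1]:], skip_special_tokens=True))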
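The usage snippet loads the base model in bfloat16 rather than in the 4-bit format used for training, so it needs noticeably more memory. The following back-of-the-envelope sketch estimates the weight footprint only; the parameter count is an assumption derived from the public Llama-3.1-8B configuration, and activations, KV cache and the adapter itself are ignored.

```python
# Assumption: weight count of Llama-3.1-8B from its public config (not stated in this card).
N_PARAMS = 8_030_261_248


def weight_gib(n_params, bits_per_param):
    """Memory taken by the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


bf16_gib = weight_gib(N_PARAMS, 16)
# Lower bound for 4-bit storage: ignores quantization constants and
# any layers that are kept in higher precision.
four_bit_gib = weight_gib(N_PARAMS, 4)

print(f"bf16 weights: {bf16_gib:.1f} GiB")
print(f"4-bit weights (lower bound): {four_bit_gib:.1f} GiB")
```

If GPU memory is tight, passing a `BitsAndBytesConfig` through the `quantization_config` argument of `from_pretrained` is the usual way to load the base model in 4-bit instead; that variant is not shown in the original card.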
Standards covered
ISA 200 · ISA 230 · ISA 240 · ISA 315 · ISA 320 · ISA 500 · ISA 570 · ISA 700/705/706 · IFRS 9 · IAS 36 · IFRS 16 · IAS 37 · SYSCOHADA Revised 2018
Dataset
- Hugging Face: Khadimeli/auditflow-ohada-dpo-dataset
- Total pairs after filtering: 3723
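Each pair contributes one term to the DPO objective, which compares how much the fine-tuned policy and the frozen reference model prefer the chosen answer over the rejected one. A minimal sketch of that per-pair loss with β = 0.1 is given below; the function name is hypothetical and the log-probabilities are illustrative numbers, not values measured on this model or dataset.

```python
import math

BETA = 0.1  # β DPO listed under Architecture


def dpo_pair_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=BETA):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen / rejected answer
    under the policy (pi_*) and the reference model (ref_*).
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)), written as a numerically stable softplus(-margin)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Illustrative log-probabilities only.
at_start = dpo_pair_loss(-42.0, -40.0, -42.0, -40.0)  # policy identical to reference
improved = dpo_pair_loss(-38.0, -45.0, -42.0, -40.0)  # policy now favours the chosen answer

print(f"{at_start:.4f}")  # 0.6931, i.e. ln 2
print(f"{improved:.4f}")  # lower than ln 2
```

A small β such as 0.1 keeps the margin modest even when the policy drifts several nats away from the reference, which limits how aggressively a single pair can pull the model.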