# Model Card for Legal-Tech-LLaMA-1B-LoRA
A lightweight Legal Tech Question-Answering model fine-tuned on domain-specific legal datasets (contracts, statutes, case law, compliance FAQs).
Built using Unsloth’s 4-bit quantized LLaMA-3.2-1B for efficient inference and memory-friendly fine-tuning.
## Model Details

### Model Description
This model specializes in legal question-answering, document reasoning, and legal text summarization.
It is fine-tuned using LoRA adapters with PEFT for efficient task-specific adaptation.
The dataset combines curated legal Q&A pairs derived from public legal sources and open government acts, plus question-answer examples synthesized with large instruction models (LLaMA-4-Scout + Groq).
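To make the LoRA approach above concrete: instead of updating the full base weights, LoRA trains a small low-rank delta on top of each frozen weight matrix. The following is a minimal NumPy illustration of that math only (not this model's actual training code; dimensions and the scaling factor are illustrative):

```python
import numpy as np

# LoRA in miniature: instead of updating a full d x d weight matrix W,
# train two small matrices A (r x d) and B (d x r) and use
#   W_eff = W + (alpha / r) * B @ A
# Only A and B (2*d*r values) are trained, never the d*d values of W.
rng = np.random.default_rng(0)
d, r, alpha = 64, 8, 16           # illustrative sizes, not the model's real config

W = rng.standard_normal((d, d))   # frozen base weight
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))              # B starts at zero, so W_eff == W before training

W_eff = W + (alpha / r) * B @ A
assert np.allclose(W_eff, W)      # the adapter contributes nothing initially

trainable = A.size + B.size       # 2*d*r = 1024 here
full = W.size                     # d*d   = 4096 here
print(f"trainable params: {trainable} vs full fine-tune: {full}")
```

The savings grow with model size, which is what makes LoRA practical on a 4-bit quantized 1B base.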
- Developed by: Arav Saxena
- Funded by: Independent Research / Self-funded
- Shared by: Arav Saxena (Legal AI Research Series)
- Model type: Instruction-tuned text generation (LoRA adapter)
- Language(s): English
- License: Apache 2.0 (follows base model’s terms)
- Finetuned from model: `unsloth/llama-3.2-1b-bnb-4bit`
### Model Sources
- Repository: [Coming soon — Legal-Tech-LLaMA GitHub Repo]
- Paper: N/A (Independent fine-tuning experiment)
- Demo: Streamlit + FastAPI deployment (endpoint: `/hackrx/run`)
## Uses

### Direct Use
- Legal question answering (acts, rights, contracts)
- Compliance chatbots / law firm assistants
- Legal document understanding & summarization
- Semantic search augmentation (retrieval-augmented generation)
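For the retrieval-augmented generation use case above, the model is fed retrieved passages as context before the question. A rough sketch with hypothetical snippets and a naive keyword-overlap retriever (a production system would use embeddings and a vector store):

```python
# Hypothetical RAG prompt construction; documents and retriever are illustrative.
def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank docs by word overlap with the question and return the top k."""
    q_words = set(question.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question: str, docs: list[str]) -> str:
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(retrieve(question, docs))
    return f"Context:\n{context}\n\nQ: {question}\nA:"

docs = [
    "An NDA restricts the disclosure of confidential information shared between parties.",
    "A force majeure clause excuses performance during extraordinary events.",
]
prompt = build_prompt("What does an NDA restrict?", docs)
print(prompt)
```

The resulting prompt string is then passed to the generation pipeline in place of the bare question.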
### Downstream Use
- Fine-tuning for specific jurisdictions (e.g., Indian law, US law)
- Integration into legal document analysis pipelines
- AI assistants for paralegals, compliance officers, or legal students
### Out-of-Scope Use
- Not for providing legally binding advice or court submissions
- Not suitable for non-English legal systems or nuanced case reasoning
- Should not replace professional legal counsel
## Bias, Risks, and Limitations
- Model responses may vary in accuracy depending on jurisdiction and source material.
- Training data may contain biases from public legal corpora and synthetic data.
- Model does not provide official legal interpretation or advice.
- Limited context length (4k tokens) — may truncate long legal documents.
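The context-length limitation in the last point is usually handled by splitting long documents into overlapping chunks and querying each chunk separately. A rough sketch, approximating token counts by whitespace words (a real pipeline would count tokens with the model tokenizer's `encode()`):

```python
# Illustrative chunking helper; max_tokens/overlap values are assumptions.
def chunk_text(text: str, max_tokens: int = 3500, overlap: int = 200) -> list[str]:
    """Split text into overlapping word-based chunks that fit the context window."""
    words = text.split()
    step = max_tokens - overlap        # consecutive chunks share `overlap` words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

doc = "clause " * 8000                 # stand-in for a long contract
chunks = chunk_text(doc)
print(len(chunks), "chunks")
```

The overlap preserves sentences that would otherwise be cut at a chunk boundary.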
### Recommendations
- Always validate model outputs with qualified legal professionals.
- Use within human-in-the-loop workflows for compliance and research.
- Avoid relying solely on this model for mission-critical legal decisions.
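One way to enforce the human-in-the-loop recommendation above is to treat every model answer as a draft that cannot leave the system without explicit sign-off. This is a hypothetical wrapper, not part of the model's API:

```python
# Illustrative human-in-the-loop gate; the DraftAnswer/release names are
# hypothetical and not provided by this model or any library.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DraftAnswer:
    question: str
    answer: str
    approved: bool = False
    reviewer: Optional[str] = None

    def approve(self, reviewer: str) -> None:
        self.approved = True
        self.reviewer = reviewer

def release(draft: DraftAnswer) -> str:
    """Refuse to release any answer that has not been human-approved."""
    if not draft.approved:
        raise PermissionError("answer requires review by a qualified professional")
    return draft.answer

draft = DraftAnswer("Is this clause enforceable?",
                    "Likely yes, subject to jurisdiction.")
try:
    release(draft)                 # blocked: no human sign-off yet
except PermissionError:
    pass
draft.approve("senior counsel")
print(release(draft))              # released only after explicit approval
```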
## How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_id = "aravsaxena/legal-tech-llama-1b-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",   # place layers on GPU automatically when available
)

qa = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = qa(
    "Q: What are the key clauses in a Non-Disclosure Agreement?\nA:",
    max_new_tokens=256,  # the pipeline's default generation length is short
)
print(result[0]["generated_text"])
```