Model Card for Legal-Tech-LLaMA-1B-LoRA

A lightweight Legal Tech Question-Answering model fine-tuned on domain-specific legal datasets (contracts, statutes, case law, compliance FAQs).
Built using Unsloth’s 4-bit quantized LLaMA-3.2-1B for efficient inference and memory-friendly fine-tuning.


Model Details

Model Description

This model specializes in legal question-answering, document reasoning, and legal text summarization.
It is fine-tuned using LoRA adapters with PEFT for efficient task-specific adaptation.
The training dataset combines curated legal Q&A pairs derived from public legal sources and open government acts with synthetic question-answer examples generated by large instruction models (LLaMA-4-Scout, served via Groq).
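A LoRA adapter leaves the frozen base weights untouched and learns only a low-rank update, so the effective weight is W + (alpha/r) * B @ A. A minimal NumPy sketch of that update (shapes and scaling are illustrative, not this model's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 8, 8, 2, 16   # illustrative sizes, not the real config

W = rng.normal(size=(d_out, d_in))    # frozen base weight
A = rng.normal(size=(r, d_in))        # trainable low-rank down-projection
B = np.zeros((d_out, r))              # trainable up-projection, zero-initialised

# Effective weight after fine-tuning: base plus scaled low-rank delta
W_eff = W + (alpha / r) * (B @ A)

# With B initialised to zero, the adapter starts as an exact no-op
assert np.allclose(W_eff, W)
```

Only A and B (a tiny fraction of the base model's parameters) are updated during training, which is what makes memory-friendly fine-tuning of a 4-bit quantized base practical.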

  • Developed by: Arav Saxena
  • Funded by: Independent Research / Self-funded
  • Shared by: Arav Saxena (Legal AI Research Series)
  • Model type: Instruction-tuned text generation (LoRA adapter)
  • Language(s): English
  • License: Apache 2.0 (follows base model’s terms)
  • Finetuned from model: unsloth/llama-3.2-1b-bnb-4bit

Model Sources

  • Repository: [Coming soon — Legal-Tech-LLaMA GitHub Repo]
  • Paper: N/A (Independent fine-tuning experiment)
  • Demo: Streamlit + FastAPI deployment (endpoint: /hackrx/run)

Uses

Direct Use

  • Legal question answering (acts, rights, contracts)
  • Compliance chatbots / law firm assistants
  • Legal document understanding & summarization
  • Semantic search augmentation (retrieval-augmented generation)
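In the retrieval-augmented setup, the model is only the generator; retrieved passages are prepended to the question before generation. A toy sketch of that prompt assembly, using naive keyword-overlap scoring as a stand-in for the real retriever (passages and scoring are placeholders, not the deployed pipeline):

```python
def retrieve(question, passages, k=2):
    """Rank passages by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(passages,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question, passages):
    """Prepend retrieved context so the model answers from the passages."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return f"Context:\n{context}\n\nQ: {question}\nA:"

docs = [
    "An NDA typically includes confidentiality, term, and remedies clauses.",
    "A lease agreement sets out rent, deposit, and termination terms.",
]
prompt = build_prompt("What clauses appear in an NDA?", docs)
```

In practice the keyword scorer would be replaced by a dense embedding search, but the prompt shape the model sees is the same.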

Downstream Use

  • Fine-tuning for specific jurisdictions (e.g., Indian law, US law)
  • Integration into legal document analysis pipelines
  • AI assistants for paralegals, compliance officers, or legal students
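Jurisdiction-specific fine-tuning can reuse the same adapter recipe. A hedged sketch of a PEFT LoRA configuration (the hyperparameters and target modules below are assumptions for illustration, not the values used to train this model):

```python
from peft import LoraConfig

# Illustrative settings only; the actual training configuration is not published
lora_config = LoraConfig(
    r=16,                      # adapter rank
    lora_alpha=32,             # scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
```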

Out-of-Scope Use

  • Not for providing legally binding advice or court submissions
  • Not suitable for non-English legal systems or nuanced case reasoning
  • Should not replace professional legal counsel

Bias, Risks, and Limitations

  • Model responses may vary in accuracy depending on jurisdiction and source material.
  • Training data may contain biases from public legal corpora and synthetic data.
  • Model does not provide official legal interpretation or advice.
  • Limited context length (4k tokens) — may truncate long legal documents.
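Because of the 4k-token window, long documents should be split before they reach the model. A minimal whitespace-based chunker with overlap (word counts are used here as a rough proxy for the tokenizer's token counts):

```python
def chunk_document(text, max_tokens=3500, overlap=200):
    """Split text into overlapping word-based chunks that fit the context window."""
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

# An 8000-word document yields three overlapping chunks under these settings
chunks = chunk_document("clause " * 8000, max_tokens=3500, overlap=200)
```

Each chunk can then be answered independently (or fed through the RAG retriever) and the per-chunk answers merged downstream.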

Recommendations

  • Always validate model outputs with qualified legal professionals.
  • Use within human-in-the-loop workflows for compliance and research.
  • Avoid relying solely on this model for mission-critical legal decisions.

How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_id = "aravsaxena/legal-tech-llama-1b-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Note: if the repository ships only the LoRA adapter weights, load the base
# model first and attach the adapter with peft.PeftModel.from_pretrained instead.

qa = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = qa(
    "Q: What are the key clauses in a Non-Disclosure Agreement?\nA:",
    max_new_tokens=256,
)
print(result[0]["generated_text"])
```