Model Card for Legal-Tech-LLaMA-1B-LoRA

A lightweight Legal Tech Question-Answering model fine-tuned on domain-specific legal datasets (contracts, statutes, case law, compliance FAQs).
Built using Unsloth’s 4-bit quantized LLaMA-3.2-1B for efficient inference and memory-friendly fine-tuning.


Model Details

Model Description

This model specializes in legal question-answering, document reasoning, and legal text summarization.
It is fine-tuned using LoRA adapters with PEFT for efficient task-specific adaptation.
The training dataset combines curated legal Q&A pairs derived from public legal sources and open government acts with synthetic question-answer examples generated by large instruction models (LLaMA-4-Scout, served via Groq).
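A LoRA adapter leaves the frozen base weights untouched and learns only a low-rank update, so the effective weight is W + (alpha/r) * B @ A. A minimal NumPy sketch of that update (shapes and scaling are illustrative, not this model's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 8, 8, 2, 16   # illustrative sizes, not the real config

W = rng.normal(size=(d_out, d_in))    # frozen base weight
A = rng.normal(size=(r, d_in))        # trainable low-rank down-projection
B = np.zeros((d_out, r))              # trainable up-projection, zero-initialised

# Effective weight after fine-tuning: base plus scaled low-rank delta
W_eff = W + (alpha / r) * (B @ A)

# With B initialised to zero, the adapter starts as an exact no-op
assert np.allclose(W_eff, W)
```

Only A and B (a tiny fraction of the base model's parameters) are updated during training, which is what makes memory-friendly fine-tuning of a 4-bit quantized base practical.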

  • Developed by: Arav Saxena
  • Funded by: Independent Research / Self-funded
  • Shared by: Arav Saxena (Legal AI Research Series)
  • Model type: Instruction-tuned text generation (LoRA adapter)
  • Language(s): English
  • License: Apache 2.0 (follows base model’s terms)
  • Finetuned from model: unsloth/llama-3.2-1b-bnb-4bit

Model Sources

  • Repository: [Coming soon — Legal-Tech-LLaMA GitHub Repo]
  • Paper: N/A (Independent fine-tuning experiment)
  • Demo: Streamlit + FastAPI deployment (endpoint: /hackrx/run)

Uses

Direct Use

  • Legal question answering (acts, rights, contracts)
  • Compliance chatbots / law firm assistants
  • Legal document understanding & summarization
  • Semantic search augmentation (retrieval-augmented generation)
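In the retrieval-augmented setup, the model is only the generator; retrieved passages are prepended to the question before generation. A toy sketch of that prompt assembly, using naive keyword-overlap scoring as a stand-in for the real retriever (passages and scoring are placeholders, not the deployed pipeline):

```python
def retrieve(question, passages, k=2):
    """Rank passages by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    scored = sorted(passages,
                    key=lambda p: len(q_words & set(p.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(question, passages):
    """Prepend retrieved context so the model answers from the passages."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages))
    return f"Context:\n{context}\n\nQ: {question}\nA:"

docs = [
    "An NDA typically includes confidentiality, term, and remedies clauses.",
    "A lease agreement sets out rent, deposit, and termination terms.",
]
prompt = build_prompt("What clauses appear in an NDA?", docs)
```

In practice the keyword scorer would be replaced by a dense embedding search, but the prompt shape the model sees is the same.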

Downstream Use

  • Fine-tuning for specific jurisdictions (e.g., Indian law, US law)
  • Integration into legal document analysis pipelines
  • AI assistants for paralegals, compliance officers, or legal students
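Jurisdiction-specific fine-tuning can reuse the same adapter recipe. A hedged sketch of a PEFT LoRA configuration (the hyperparameters and target modules below are assumptions for illustration, not the values used to train this model):

```python
from peft import LoraConfig

# Illustrative settings only; the actual training configuration is not published
lora_config = LoraConfig(
    r=16,                      # adapter rank
    lora_alpha=32,             # scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
```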

Out-of-Scope Use

  • Not for providing legally binding advice or court submissions
  • Not suitable for non-English legal systems or nuanced case reasoning
  • Should not replace professional legal counsel

Bias, Risks, and Limitations

  • Model responses may vary in accuracy depending on jurisdiction and source material.
  • Training data may contain biases from public legal corpora and synthetic data.
  • Model does not provide official legal interpretation or advice.
  • Limited context length (4k tokens) — may truncate long legal documents.
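Because of the 4k-token window, long documents should be split before they reach the model. A minimal whitespace-based chunker with overlap (word counts are used here as a rough proxy for the tokenizer's token counts):

```python
def chunk_document(text, max_tokens=3500, overlap=200):
    """Split text into overlapping word-based chunks that fit the context window."""
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

# An 8000-word document yields three overlapping chunks under these settings
chunks = chunk_document("clause " * 8000, max_tokens=3500, overlap=200)
```

Each chunk can then be answered independently (or fed through the RAG retriever) and the per-chunk answers merged downstream.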

Recommendations

  • Always validate model outputs with qualified legal professionals.
  • Use within human-in-the-loop workflows for compliance and research.
  • Avoid relying solely on this model for mission-critical legal decisions.

How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_id = "aravsaxena/legal-tech-llama-1b-lora"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Note: if the repository ships only the LoRA adapter weights, load the base
# model first and attach the adapter with peft.PeftModel.from_pretrained instead.

qa = pipeline("text-generation", model=model, tokenizer=tokenizer)
result = qa(
    "Q: What are the key clauses in a Non-Disclosure Agreement?\nA:",
    max_new_tokens=256,
)
print(result[0]["generated_text"])
```