# legal-nli-roberta-base
## Model Overview
The legal-nli-roberta-base model is a fine-tuned RoBERTa-Base model specialized in Legal Natural Language Inference (NLI). Its purpose is to analyze the logical relationship between two pieces of text from legal documents (e.g., a court ruling and a proposed amendment, or a contract clause and a statement of fact). It classifies the relationship into one of three standard NLI categories: Entailment, Contradiction, or Neutral.
## Model Architecture
- Base Model: RoBERTa-Base (A Robustly Optimized BERT Pretraining Approach).
- Task: Sequence Classification (`RobertaForSequenceClassification`) on text pairs.
- Mechanism: The model takes a pair of texts, the Premise (e.g., the contract clause) and the Hypothesis (e.g., the claim about the clause), separated by a special token. The Transformer encoder processes the concatenated input, and the final representation of the `<s>` token (RoBERTa's equivalent of BERT's `[CLS]`) is passed to a classification head. The pair encoding is sketched after this list.
- Labels:
  - Entailment (0): The Hypothesis must be true if the Premise is true.
  - Contradiction (1): The Hypothesis must be false if the Premise is true.
  - Neutral (2): The Hypothesis could be true or false given the Premise.
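As a minimal sketch of how the text pair is joined, shown with the stock `roberta-base` tokenizer (which a RoBERTa fine-tune typically inherits):

```python
from transformers import AutoTokenizer

# A RoBERTa tokenizer joins a text pair with </s></s> between the two segments
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

encoded = tokenizer("The supplier shall deliver by Dec 31.",
                    "Delivery is due by Dec 31.")
print(tokenizer.decode(encoded["input_ids"]))
# <s>The supplier shall deliver by Dec 31.</s></s>Delivery is due by Dec 31.</s>
```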
## Intended Use
- Contract Review Automation: Checking if a new clause contradicts an existing term or if a statement of compliance is supported by the contract text (see the sketch after this list).
- Legal Research: Analyzing case law to determine if a new judgment supports or refutes a previous ruling or legal principle.
- Compliance Checks: Verifying internal company policies against external regulations.
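As an illustrative sketch of the contract-review use case, checking one compliance statement against several clauses (the clause texts here are hypothetical, and the model call mirrors the Example Code section below):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "YourOrg/legal-nli-roberta-base"  # placeholder name used in this card
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Hypothetical clauses and a statement to verify against each of them
clauses = [
    "Either party may terminate this agreement with 30 days' written notice.",
    "All disputes shall be resolved exclusively by arbitration in Geneva.",
]
statement = "The agreement can be terminated without any notice period."

for clause in clauses:
    inputs = tokenizer(clause, statement, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    label = model.config.id2label[logits.argmax(dim=-1).item()]
    print(f"{clause[:40]}... -> {label}")
```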
## Limitations and Ethical Considerations
- Domain-Specific Logic: Legal documents often rely on highly nuanced, domain-specific definitions and complex nested dependencies (e.g., "if A, then B, unless C is true"). The model may struggle with these deep, long-range logical structures.
- No Substitute for Counsel: This model is a tool for rapid analysis, not a provider of legal advice. Final interpretation and decision-making must be handled by qualified legal professionals.
- Data Bias: The model is trained on specific corpora of legal texts. If the data is biased towards certain jurisdictions or legal systems, performance on others will be poor.
- Max Length: The model's context window is limited (`max_position_embeddings=514`, which corresponds to 512 usable tokens in RoBERTa). Long premises (e.g., entire contract sections) must be truncated, leading to potential loss of critical information; one chunking workaround is sketched below.
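A common workaround for the length limit, sketched below under the assumption of a 512-token effective window (the chunk and stride sizes are illustrative, not tuned values), is to split a long premise into overlapping chunks and score each chunk against the hypothesis:

```python
import torch

def score_long_premise(premise, hypothesis, tokenizer, model,
                       chunk_tokens=400, stride=200):
    """Score overlapping chunks of a long premise against one hypothesis.

    Returns the predicted NLI label for each chunk; how to aggregate them
    (e.g., flag if any chunk yields Contradiction) is application-specific.
    """
    ids = tokenizer(premise, add_special_tokens=False)["input_ids"]
    labels = []
    for start in range(0, max(len(ids) - stride, 1), stride):
        chunk_text = tokenizer.decode(ids[start:start + chunk_tokens])
        inputs = tokenizer(chunk_text, hypothesis, truncation=True,
                           max_length=512, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        labels.append(model.config.id2label[logits.argmax(dim=-1).item()])
    return labels
```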
## Example Code
To analyze the relationship between a legal premise and a hypothesis:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "YourOrg/legal-nli-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Define Premise and Hypothesis
premise = "The supplier is obligated to deliver all specified goods no later than December 31, 2025, provided that the initial payment is received by October 1st."
hypothesis = "The supplier must deliver the goods by December 31, 2025, regardless of the payment date."

# Encode the text pair
encoded_input = tokenizer(
    premise,
    hypothesis,
    truncation=True,
    padding=True,
    return_tensors="pt",
)

# Inference
with torch.no_grad():
    outputs = model(**encoded_input)
    logits = outputs.logits

# Get the predicted label
predicted_class_id = logits.argmax(dim=-1).item()
predicted_label = model.config.id2label[predicted_class_id]

print(f"Premise: {premise[:50]}...")
print(f"Hypothesis: {hypothesis}")
print(f"NLI Relationship: **{predicted_label}**")
# Expected Output: Contradiction (due to the "provided that" condition)
```
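To turn the raw logits into per-class confidence scores rather than a single label, a softmax can be applied (continuing from the variables defined above):

```python
import torch.nn.functional as F

# Convert logits to a probability distribution over the three NLI labels
probs = F.softmax(logits, dim=-1).squeeze()
for class_id, p in enumerate(probs.tolist()):
    print(f"{model.config.id2label[class_id]}: {p:.3f}")
```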