T5-Base News Summarizer (Multi-Style)

This model is a fine-tuned version of google/flan-t5-base trained on samples from CNN/DailyMail and XSum.
It generates news summaries in three styles: Harsh (concise, headline-like), Standard, and Detailed.


Model Description

  • Model Type: Sequence-to-Sequence Transformer (T5)
  • Language: English
  • Base Model: google/flan-t5-base
  • Parameters: ~0.2B
  • Training Data: ~9k mixed samples from CNN/DailyMail & XSum

Key Features

This model supports a style prompt prefix that determines summary length and density:

  1. Harsh
     • Very concise, headline-like
     • Trained mostly on XSum
  2. Standard
     • Balanced, general-purpose summarization
  3. Detailed
     • Longer, more contextual summaries
     • Trained with CNN/DailyMail
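In practice, each style maps to a plain-text prefix prepended to the input (the prefixes match the Usage section below; the `build_prompt` helper is illustrative, not part of the model's API):

```python
# Map each style to the prompt prefix used during fine-tuning.
# build_prompt is an illustrative helper, not part of the model API.
STYLE_PREFIXES = {
    "harsh": "summarize harsh: ",
    "standard": "summarize standard: ",
    "detailed": "summarize detailed: ",
}

def build_prompt(style: str, text: str) -> str:
    """Prepend the style prefix; raise on unknown styles."""
    try:
        return STYLE_PREFIXES[style] + text
    except KeyError:
        raise ValueError(f"Unknown style: {style!r}")

print(build_prompt("harsh", "JWST captures Cosmic Cliffs."))
# → summarize harsh: JWST captures Cosmic Cliffs.
```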

Usage

from transformers import pipeline

summarizer = pipeline("summarization", model="Hiratax/t5-news-summarizer")

text = """
The James Webb Space Telescope (JWST) has captured a lush landscape of stellar birth. 
The new image shows the Cosmic Cliffs, which are the edge of a giant gaseous cavity within the star-forming region NGC 3324.
"""

# 1. Standard
print(summarizer("summarize standard: " + text))

# 2. Harsh (Headline)
print(summarizer("summarize harsh: " + text))

# 3. Detailed
print(summarizer("summarize detailed: " + text))

Recommended Inference Parameters

| Style    | Min Length   | Max Length    | Length Penalty | Repetition Penalty | N-Gram Block |
|----------|--------------|---------------|----------------|--------------------|--------------|
| Harsh    | 10           | 35% of input  | 1.0            | 2.0                | 3            |
| Standard | 60           | 150           | 2.0            | 1.5                | 3            |
| Detailed | 50% of input | 150% of input | 1.5            | 1.2                | 4            |

Tip:
"Detailed" style benefits from no_repeat_ngram_size=4 to avoid repeated openings.
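The percentage-based lengths have to be resolved against the tokenized input at call time. A minimal sketch (the `resolve_gen_kwargs` helper is an assumption, not part of the model; `input_tokens` should come from the model's tokenizer):

```python
# Turn the table above into generation kwargs for a given input length.
# resolve_gen_kwargs is illustrative; input_tokens should be the token
# count of the tokenized input, not a word count.
def resolve_gen_kwargs(style: str, input_tokens: int) -> dict:
    if style == "harsh":
        return {"min_length": 10,
                "max_length": max(10, int(input_tokens * 0.35)),
                "length_penalty": 1.0,
                "repetition_penalty": 2.0,
                "no_repeat_ngram_size": 3}
    if style == "standard":
        return {"min_length": 60,
                "max_length": 150,
                "length_penalty": 2.0,
                "repetition_penalty": 1.5,
                "no_repeat_ngram_size": 3}
    if style == "detailed":
        return {"min_length": int(input_tokens * 0.5),
                "max_length": int(input_tokens * 1.5),
                "length_penalty": 1.5,
                "repetition_penalty": 1.2,
                "no_repeat_ngram_size": 4}
    raise ValueError(f"Unknown style: {style!r}")
```

The result can be passed straight through the pipeline, e.g. `summarizer("summarize detailed: " + text, **resolve_gen_kwargs("detailed", 200))`.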


Training Procedure

Hyperparameters

  • Epochs: 8
  • Learning Rate: 1e-4
  • Batch Size: 4
  • Gradient Accumulation: 2
  • Weight Decay: 0.01
  • Optimizer: AdamW
  • Precision: FP16
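The hyperparameters above map roughly onto Hugging Face `Seq2SeqTrainingArguments` as follows (a sketch; the output directory is a placeholder and the exact arguments used for this model are not published):

```python
# Sketch of the training configuration, assuming the standard
# transformers Seq2SeqTrainingArguments API.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-news-summarizer",
    num_train_epochs=8,
    learning_rate=1e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,  # effective batch size of 8
    weight_decay=0.01,
    optim="adamw_torch",
    fp16=True,
)
```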

Data Strategy

  • Harsh → XSum (abstractive, short)
  • Detailed → CNN/DailyMail (longer, higher detail)
  • Safety: Removed cases where summary > article length to reduce hallucinations

Limitations

  • May occasionally output the typo "occupys" (training noise).
  • Max input length: 512 tokens (longer text is truncated).
  • Model performance decreases on extremely long or highly technical articles.
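For articles past the 512-token limit, one common workaround is to summarize in chunks and concatenate the results. A rough sketch (the `chunk_text` helper and the 400-word budget are assumptions; use the model's tokenizer for exact token counts):

```python
def chunk_text(text: str, max_words: int = 400) -> list[str]:
    """Split text into word-bounded chunks so each stays comfortably
    under the 512-token input limit (words used as a token proxy)."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# Each chunk can then be summarized separately, e.g.:
# parts = [summarizer("summarize standard: " + c)[0]["summary_text"]
#          for c in chunk_text(long_article)]
```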

License

Apache 2.0
