GoEmotions DistilBERT Emotion Classifier

Model Overview

This model is distilbert-base-uncased fine-tuned on the GoEmotions dataset for multi-label emotion classification. For each input it predicts a separate probability for each of 28 labels (27 emotion categories plus neutral).

The classifier takes a short piece of text and returns a list of emotions, each with a confidence score.
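
As a rough illustration of how per-label confidence scores can come out of a multi-label classifier, the sketch below applies a sigmoid to each raw logit independently and keeps the labels that pass a cutoff. The five-label subset, the logit values, and the 0.5 threshold are all assumptions made for the example; the model card does not state which threshold, if any, the author used.

```python
import math

# Illustrative subset of the 28 GoEmotions labels (assumption: a real run uses all 28).
LABELS = ["admiration", "anger", "joy", "sadness", "neutral"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def scores_from_logits(logits, labels=LABELS, threshold=0.5):
    """Return (label, probability) pairs at or above threshold, highest first.

    Each label is scored independently, so several labels (or none) can pass.
    """
    scored = [(label, sigmoid(z)) for label, z in zip(labels, logits)]
    kept = [(label, p) for label, p in scored if p >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up logits for one input text; a real model would produce 28 values.
example_logits = [2.0, -3.0, 1.0, -1.5, -0.5]
predictions = scores_from_logits(example_logits)
for label, prob in predictions:
    print(f"{label}: {prob:.3f}")
# admiration: 0.881
# joy: 0.731
```

Because the scores are not forced to sum to 1, lowering the threshold returns more labels per text and raising it returns fewer.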


Intended Use

  • Emotion detection in short texts such as comments, headlines, messages, or social media posts
  • Educational demonstrations of multi-label classification
  • Research or experimentation with emotion analysis

The model is not intended for clinical, legal, or other high-stakes decision-making.


Dataset

Dataset: GoEmotions
Source: https://huggingface.co/datasets/go_emotions
Size: ~58,000 text examples
Labels: 28 emotions (multi-label)

Each example is annotated with zero or more emotions. The dataset is designed for studying fine-grained emotional expressions in short natural language.
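
For multi-label training, annotations like these are usually turned into a fixed-length multi-hot target vector. The sketch below assumes each example's labels arrive as a list of integer ids in the range 0 to 27; the helper name `to_multi_hot` and the example ids are hypothetical and are not taken from the notebook.

```python
NUM_LABELS = 28  # 27 emotion categories plus neutral


def to_multi_hot(label_ids, num_labels=NUM_LABELS):
    """Convert a list of label indices into a float multi-hot vector."""
    vector = [0.0] * num_labels
    for idx in label_ids:
        if not 0 <= idx < num_labels:
            raise ValueError(f"label id {idx} is outside 0..{num_labels - 1}")
        vector[idx] = 1.0
    return vector


# Hypothetical annotation: an example tagged with label ids 2 and 17.
target = to_multi_hot([2, 17])
print(sum(target), target[2], target[17])
# 2.0 1.0 1.0
```

Floats are used rather than integers because binary cross-entropy losses generally expect floating-point targets.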


Training Details

  • Base model: distilbert-base-uncased
  • Training: Fine-tuned for 1 epoch
  • Task: Multi-label classification
  • Loss function: Binary cross entropy
  • Input length: 128 tokens
  • Optimizer: AdamW
  • Batch size: 16
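
The binary cross-entropy loss listed above treats each of the 28 labels as its own yes/no decision. The following is a minimal pure-Python sketch of that loss for a single example, computed from raw logits in a numerically stable form. It is meant to show the arithmetic only and is not the notebook's code; the three-label inputs are made up.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Confident and correct on all three labels: low loss.
loss_good = bce_with_logits([3.0, -3.0, -3.0], [1.0, 0.0, 0.0])
# Confident and wrong on the first two labels: much higher loss.
loss_bad = bce_with_logits([-3.0, 3.0, -3.0], [1.0, 0.0, 0.0])
print(f"{loss_good:.4f} {loss_bad:.4f}")
# 0.0486 2.0486
```

Each label contributes to the loss separately, which is what lets the model assign high probability to several emotions at once.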

Code used in the notebook:

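from transformers import AutoModelForSequenceClassification

# Load DistilBERT with a 28-way classification head. Setting problem_type to
# "multi_label_classification" selects a binary cross-entropy loss.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=28,
    problem_type="multi_label_classification",
)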