ESG Greenwashing Detection Model

Multi-task PhoBERT model for Vietnamese ESG content analysis.

Model Architecture

A shared PhoBERT encoder is trained jointly on four tasks:

  1. Greenwashing Classification (Legitimate/Greenwashing/Uncertain)
  2. ESG Pillar Classification (Environmental/Social/Governance/General)
  3. Content Quality Scoring (0-100)
  4. ESG Score Prediction (0-100)
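The four tasks above can be pictured as four small output heads sharing one sentence embedding. The following is a minimal sketch, not the released architecture: the class name `MultiTaskHeads`, the dropout rate, the key names, and the sigmoid scaling of the regression outputs to 0-100 are all assumptions made for illustration. The hidden size of 768 matches vinai/phobert-base.

```python
import torch
from torch import nn


class MultiTaskHeads(nn.Module):
    """Hypothetical task heads applied to a pooled encoder output."""

    def __init__(self, hidden_size: int = 768, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        # Classification heads: one logit per class.
        self.greenwashing = nn.Linear(hidden_size, 3)  # Legitimate / Greenwashing / Uncertain
        self.pillar = nn.Linear(hidden_size, 4)        # Environmental / Social / Governance / General
        # Regression heads: one scalar each, later squashed to the 0-100 range.
        self.quality = nn.Linear(hidden_size, 1)
        self.esg_score = nn.Linear(hidden_size, 1)

    def forward(self, pooled: torch.Tensor) -> dict:
        x = self.dropout(pooled)
        return {
            "greenwashing_logits": self.greenwashing(x),
            "pillar_logits": self.pillar(x),
            # Assumption: a sigmoid keeps both scores inside [0, 100].
            "quality": 100.0 * torch.sigmoid(self.quality(x)).squeeze(-1),
            "esg_score": 100.0 * torch.sigmoid(self.esg_score(x)).squeeze(-1),
        }


# Random vectors stand in for the encoder's pooled output (batch of 2).
heads = MultiTaskHeads().eval()
pooled = torch.randn(2, 768)
with torch.no_grad():
    out = heads(pooled)
print({name: tuple(t.shape) for name, t in out.items()})
```

In a full model, `pooled` would come from the PhoBERT encoder (for example its first-token hidden state), and the four outputs would each contribute a loss term during training.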

Training

  • Base Model: vinai/phobert-base
  • Strategy: stratified_group_kfold_5
  • Folds: 5
  • Total Samples: 5214

Performance

Greenwashing Detection

  • F1 Score: 0.596
  • Precision: 0.624
  • Recall: 0.605
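The reported F1 (0.596) is lower than both precision and recall. That cannot happen for a single precision/recall pair, because their harmonic mean always lies between the two, but it is normal when per-class (or per-fold) scores are averaged. The card does not state which averaging was used; the sketch below assumes macro averaging over the three classes and uses toy labels to show the effect.

```python
def macro_prf(y_true, y_pred, labels):
    """Macro-averaged precision, recall and F1 over the given labels."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        actual = sum(1 for t in y_true if t == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


LABELS = ["Legitimate", "Greenwashing", "Uncertain"]
# Toy example: one Legitimate sample is misread as Greenwashing.
y_true = ["Legitimate", "Legitimate", "Greenwashing", "Uncertain"]
y_pred = ["Legitimate", "Greenwashing", "Greenwashing", "Uncertain"]

precision, recall, f1 = macro_prf(y_true, y_pred, LABELS)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here macro precision and macro recall are both 5/6, while macro F1 is 7/9, below either of them.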

Pillar Classification

  • Accuracy: 0.411
  • F1 Macro: 0.403

Quality Scoring

  • MAE: 8.326
  • R²: 0.401

ESG Score Prediction

  • MAE: 10.318
  • R²: 0.405
  • Correlation: 0.659
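For reference, the regression metrics above can be computed as follows. This is a plain-Python sketch using the standard definitions; it assumes "Correlation" means the Pearson coefficient, which the card does not state, and the numbers are toy values rather than model outputs.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, R² and Pearson correlation for two equal-length sequences."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n

    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    pearson = cov / math.sqrt(ss_tot * ss_pred)

    return {"mae": mae, "r2": r2, "pearson": pearson}


metrics = regression_metrics([10, 20, 30, 40], [12, 18, 33, 37])
print(metrics)
```

R² can never exceed the squared Pearson correlation, because R² also penalises offset and scale errors. The ESG score figures are consistent with this: 0.659² ≈ 0.434, slightly above the reported R² of 0.405.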

Usage

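from transformers import AutoTokenizer, AutoModel
import torch

# Load the tokenizer from the Hub repository
tokenizer = AutoTokenizer.from_pretrained("hiennthp/esg-bank-model-v3")

# MultiTaskPhoBERT and config are not defined in this card: build the
# model architecture first, then load the weights of one fold.
# model = MultiTaskPhoBERT(config)
# model.load_state_dict(torch.load("best_model_fold0.pt", map_location="cpu"))
# model.eval()  # disable dropout for inference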

Files

  • best_model_fold0.pt - Fold 0 model weights
  • best_model_fold1.pt - Fold 1 model weights
  • best_model_fold2.pt - Fold 2 model weights
  • best_model_fold3.pt - Fold 3 model weights
  • best_model_fold4.pt - Fold 4 model weights
  • step5_metrics.json - Detailed metrics with per-fold breakdown
  • tokenizer/ - PhoBERT tokenizer files
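Since five fold checkpoints are provided, one option is to run all of them and average their class probabilities instead of relying on a single fold. The card does not say whether the authors do this. The sketch below is an illustration only, with made-up logits in place of real model outputs and an assumed label order.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_fold_probs(per_fold_logits):
    """Average class probabilities across folds (one logit list per fold)."""
    probs = [softmax(logits) for logits in per_fold_logits]
    n_folds = len(probs)
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / n_folds for i in range(n_classes)]


# Assumed label order; check it against the training configuration.
LABELS = ["Legitimate", "Greenwashing", "Uncertain"]

# Hypothetical greenwashing logits for one text, one row per fold model.
fold_logits = [
    [2.0, 0.5, 0.1],
    [1.5, 1.0, 0.2],
    [0.3, 1.8, 0.4],
    [2.2, 0.1, 0.0],
    [1.0, 0.9, 0.5],
]

avg_probs = average_fold_probs(fold_logits)
prediction = LABELS[max(range(len(avg_probs)), key=avg_probs.__getitem__)]
print(prediction, [round(p, 3) for p in avg_probs])
```

Averaging probabilities rather than raw logits keeps any one overconfident fold from dominating the result.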

Citation

@software{esg_greenwashing_model,
  author = {ESG Research Team},
  title = {Vietnamese ESG Greenwashing Detection Model},
  year = {2026},
  url = {https://huggingface.co/hiennthp/esg-bank-model-v3}
}