# XLM-RoBERTa Tourism Sentiment Classifier

Fine-tuned XLM-RoBERTa model for multilingual tourism comment sentiment analysis.

## Model Description

This model classifies tourism comments in 39 languages into 3 sentiment categories:

- **positive** (Tích cực)
- **neutral** (Trung lập)
- **negative** (Tiêu cực)

## Training Data
- Languages: 39 languages (English, Korean, Russian, German, French, Italian, Spanish, and 32 more)
- Dataset: 2,123 multilingual tourism comments from social media
- Sources: Google Maps, TikTok, YouTube, Facebook
- Split: 80% train (1,698), 20% validation (425)
- Quality: Filtered for meaningful content (min 10 words)
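The quality filter above can be sketched as a simple word-count check (a sketch only; the function name is illustrative, and the card does not include the project's actual preprocessing code):

```python
# Sketch of the quality filter described above: keep only comments
# with at least 10 words. `is_meaningful` is an illustrative name,
# not the project's actual preprocessing API.
def is_meaningful(comment: str, min_words: int = 10) -> bool:
    return len(comment.split()) >= min_words

comments = [
    "Great view!",
    "The beach was clean, staff friendly, and the food at the nearby stalls was excellent",
]
kept = [c for c in comments if is_meaningful(c)]
print(len(kept))  # 1 -- only the second comment passes the filter
```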
### Language Distribution (Top 10)
| Language | Count | Percentage |
|---|---|---|
| English (en) | 1,088 | 51.3% |
| Korean (ko) | 306 | 14.4% |
| Russian (ru) | 156 | 7.3% |
| German (de) | 98 | 4.6% |
| French (fr) | 59 | 2.8% |
| Italian (it) | 50 | 2.4% |
| Spanish (es) | 46 | 2.2% |
| Filipino (tl) | 32 | 1.5% |
| Indonesian (id) | 23 | 1.1% |
| Polish (pl) | 22 | 1.0% |
### Sentiment Distribution
| Sentiment | Count | Percentage |
|---|---|---|
| Positive | 1,325 | 62.4% |
| Negative | 547 | 25.8% |
| Neutral | 251 | 11.8% |
## Performance
| Metric | Score |
|---|---|
| Accuracy | 80.71% |
| F1 Macro | 66.00% |
| F1 Weighted | 79.48% |
### Per-Class Performance
| Sentiment | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Positive | 89.67% | 91.70% | 90.67% | 265 |
| Neutral | 50.00% | 26.00% | 34.21% | 50 |
| Negative | 67.97% | 79.09% | 73.11% | 110 |
### Confusion Matrix

| | Pred. positive | Pred. neutral | Pred. negative |
|---|---|---|---|
| **Actual positive** | 243 | 8 | 14 |
| **Actual neutral** | 10 | 13 | 27 |
| **Actual negative** | 18 | 5 | 87 |
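The per-class scores and overall metrics reported above follow directly from this confusion matrix; a quick pure-Python re-derivation:

```python
# Re-deriving the reported metrics from the confusion matrix
# (rows = actual class, columns = predicted class)
cm = [[243,  8, 14],   # actual positive
      [ 10, 13, 27],   # actual neutral
      [ 18,  5, 87]]   # actual negative

labels = ['positive', 'neutral', 'negative']
total = sum(sum(row) for row in cm)                  # 425 validation samples
accuracy = sum(cm[i][i] for i in range(3)) / total

f1_scores = []
for i, label in enumerate(labels):
    tp = cm[i][i]
    precision = tp / sum(cm[r][i] for r in range(3)) # column sum = predicted count
    recall = tp / sum(cm[i])                         # row sum = actual count (support)
    f1 = 2 * precision * recall / (precision + recall)
    f1_scores.append(f1)
    print(f"{label}: P={precision:.2%} R={recall:.2%} F1={f1:.2%}")

print(f"accuracy={accuracy:.2%} f1_macro={sum(f1_scores) / 3:.2%}")
# accuracy=80.71% f1_macro=66.00%
```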
## Usage

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

# Define the model architecture
class XLMRobertaSentimentClassifier(nn.Module):
    def __init__(self, n_classes=3, dropout=0.3):
        super().__init__()
        self.xlm_roberta = AutoModel.from_pretrained('xlm-roberta-base')
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(self.xlm_roberta.config.hidden_size, n_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.xlm_roberta(input_ids=input_ids, attention_mask=attention_mask)
        pooled_output = outputs[0][:, 0, :]  # <s> ([CLS]) token representation
        pooled_output = self.dropout(pooled_output)
        logits = self.classifier(pooled_output)
        return logits

# Load the model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base')
model = XLMRobertaSentimentClassifier(n_classes=3)
checkpoint = torch.load('xlm_sentiment_best_model.pt', map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
model = model.to(device)
model.eval()

# Predict (works with any supported language)
text = "Beautiful beach, loved it!"   # English
# text = "아름다운 해변이에요!"          # Korean: "What a beautiful beach!"
# text = "Sehr schöner Strand!"       # German: "Very beautiful beach!"

encoding = tokenizer(
    text,
    add_special_tokens=True,
    max_length=256,
    padding='max_length',
    truncation=True,
    return_attention_mask=True,
    return_tensors='pt'
)
input_ids = encoding['input_ids'].to(device)
attention_mask = encoding['attention_mask'].to(device)

with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)
    probs = torch.softmax(outputs, dim=1)
    confidence, predicted = torch.max(probs, dim=1)

sentiments = ['positive', 'neutral', 'negative']
print(f"Sentiment: {sentiments[predicted.item()]} ({confidence.item():.4f})")
# Output: Sentiment: positive (0.9981)
```
## Training Details
- Base Model: xlm-roberta-base
- Architecture: XLM-RoBERTa + Dropout (0.3) + Linear (768 → 3)
- Parameters: ~270M (base) + 2,307 (classifier)
- Training Time: ~15-20 minutes on CUDA
- Epochs: 5 (best at epoch 5)
- Batch Size: 16
- Learning Rate: 2e-5
- Optimizer: AdamW with linear warmup
- Loss: CrossEntropyLoss
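The "AdamW with linear warmup" schedule above can be sketched in pure Python, mirroring the behavior of `get_linear_schedule_with_warmup` from `transformers` (linear ramp-up, then linear decay to zero). The warmup step count below is an assumption for illustration; the card does not state the warmup fraction used.

```python
# Sketch of a linear-warmup-then-linear-decay LR schedule, as used by
# transformers' get_linear_schedule_with_warmup. warmup_steps is an
# assumed value -- the card does not specify it.
def lr_at_step(step, total_steps, warmup_steps, base_lr=2e-5):
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# 1,698 training samples at batch size 16 -> 107 steps/epoch, 5 epochs
total_steps = 107 * 5  # 535

print(lr_at_step(0, total_steps, 53))    # 0.0 (start of warmup)
print(lr_at_step(53, total_steps, 53))   # 2e-05 (peak learning rate)
print(lr_at_step(535, total_steps, 53))  # 0.0 (end of training)
```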
### Training Progress
| Epoch | Train Loss | Train Acc | Val Loss | Val Acc | F1 Macro |
|---|---|---|---|---|---|
| 1 | 0.9240 | 59.25% | 0.6880 | 62.35% | 25.60% |
| 2 | 0.6837 | 73.14% | 0.5516 | 79.06% | 53.71% |
| 3 | 0.5876 | 77.68% | 0.5722 | 77.41% | 52.72% |
| 4 | 0.4880 | 81.33% | 0.5112 | 80.94% | 60.46% |
| 5 | 0.4056 | 83.80% | 0.5240 | 80.71% | 66.00% ⭐ |
## Supported Languages
🌍 39 Languages: English, Korean, Russian, German, French, Italian, Spanish, Filipino, Indonesian, Polish, Portuguese, Chinese, Dutch, Afrikaans, Thai, Arabic, Romanian, Czech, Japanese, Catalan, Danish, Somali, Hebrew, Finnish, Welsh, Ukrainian, Turkish, Slovak, Swedish, Croatian, Norwegian, Hungarian, Estonian, Albanian, Bulgarian, Swahili, Greek, Macedonian, and more!
## Features

- ✅ **Truly Multilingual**: Works with 39+ languages out of the box
- ✅ **High Accuracy**: 80.71% overall accuracy across all languages
- ✅ **Zero-Shot Capability**: Can handle new languages not in the training set
- ✅ **Production Ready**: Used in a real-world tourism monitoring system
- ✅ **Fast Inference**: ~80 ms per comment on GPU
## Limitations

- Lower performance on the neutral class due to data imbalance (only 11.8% of the data)
- English-dominant training data (51.3%) may reduce performance on non-English comments
- Trained on the tourism domain; may not generalize to other domains
- Minority languages are underrepresented and would benefit from more training data
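One common mitigation for the neutral-class imbalance noted above is inverse-frequency class weighting (the "balanced" heuristic, `total / (n_classes * count)`). This is a sketch of a possible fix, not what this model was trained with; the card lists plain CrossEntropyLoss.

```python
# Inverse-frequency class weights from the sentiment distribution above.
# A sketch of a possible imbalance mitigation, not this model's training setup.
counts = {'positive': 1325, 'neutral': 251, 'negative': 547}
total = sum(counts.values())  # 2123
weights = {k: total / (len(counts) * n) for k, n in counts.items()}

print({k: round(w, 2) for k, w in weights.items()})
# {'positive': 0.53, 'neutral': 2.82, 'negative': 1.29}
# The rare neutral class gets the largest weight; these values could be
# passed as torch.tensor([...]) to nn.CrossEntropyLoss(weight=...).
```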
## Use Cases
- International tourism review sentiment analysis
- Multilingual social media monitoring
- Cross-cultural customer feedback analysis
- Global tourism demand analysis
- Automated content moderation for tourism platforms
## Benchmark Comparison
| Model | Languages | F1 Macro | Accuracy | Domain |
|---|---|---|---|---|
| XLM-RoBERTa Tourism | 39 | 66.00% | 80.71% | Tourism |
| mBERT Sentiment | 104 | 62.10% | 78.50% | General |
| XLM-R Base | 100 | 58.30% | 75.20% | General |
## Citation

```bibtex
@misc{xlm-roberta-tourism-sentiment,
  author       = {Strawberry0604},
  title        = {XLM-RoBERTa Tourism Sentiment Classifier},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/Strawberry0604/xlm-roberta-tourism-sentiment}}
}
```
## Contact
- Repository: tourism-data-monitor
- HuggingFace: @Strawberry0604
## Model Card Authors

Strawberry0604

## Model Card Contact

For questions and feedback, please open an issue in the GitHub repository.