xlm-roberta-xstance

This model is a fine-tuned version of FacebookAI/xlm-roberta-base on the ZurichNLP/x_stance dataset. It achieves the following results on the evaluation set:

Loss: 0.5225
Accuracy: 0.7687
Macro F1: 0.7687
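
The card does not say how Accuracy and Macro F1 were computed. The sketch below shows the usual definitions for a two-class problem; the integer class ids (0 and 1) and the toy labels are assumptions made for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up toy labels, not taken from the evaluation set.
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
toy_accuracy = accuracy(toy_true, toy_pred)
toy_macro_f1 = macro_f1(toy_true, toy_pred)
```

When the two classes are roughly balanced and errors are spread evenly between them, accuracy and macro F1 land close together, which would be consistent with the near-identical values reported here.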

Model description

xlm-roberta-xstance is fine-tuned for multi-target, cross-lingual stance detection and is based on the multilingual xlm-roberta-base architecture. Trained on the ZurichNLP/x_stance dataset, it reads a candidate's comment together with a target political question and classifies the stance as either FAVOR or AGAINST.

Because the base model is multilingual, the model can rely on cross-lingual transfer to detect stance in German, French, and Italian (three of Switzerland's national languages) and, in principle, in English.
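
A minimal inference sketch using the Transformers `text-classification` pipeline. Several things here are assumptions rather than facts from this card: the Hub id `MatteoFasulo/xlm-roberta-xstance`, the input order (question first, comment second), and the helper names `build_pair` and `classify_stance`, which are hypothetical.

```python
MODEL_ID = "MatteoFasulo/xlm-roberta-xstance"  # assumed Hub repository id


def build_pair(question: str, comment: str) -> dict:
    """Package a (question, comment) pair for the text-classification pipeline.

    Assumption: the question is the first segment and the comment the second.
    """
    return {"text": question.strip(), "text_pair": comment.strip()}


def classify_stance(question: str, comment: str, model_id: str = MODEL_ID):
    """Return (label, score) for one question/comment pair.

    Downloads the checkpoint on first use, so it needs network access.
    """
    # Imported lazily so build_pair stays usable without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    output = classifier(build_pair(question, comment))
    # Depending on the library version, a single input yields a dict or a one-element list.
    result = output[0] if isinstance(output, list) else output
    return result["label"], result["score"]
```

For example, `classify_stance("Soll das Rentenalter erhöht werden?", "Nein, das trifft die Falschen.")` (a made-up pair) would return a label and a confidence score. The label strings depend on the `id2label` mapping stored in the checkpoint's config, so they may appear as FAVOR/AGAINST or as generic names such as LABEL_0/LABEL_1.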

Intended uses & limitations

Intended Uses

Stance Detection: Determining whether a text expresses a supportive (FAVOR) or opposing (AGAINST) stance toward a specific political issue or question.
Cross-Lingual Transfer: Evaluating stances in languages like German, French, Italian, and English, even when training data for a specific language is limited.
Cross-Target Stance Detection: Analyzing stances on new target questions or topics that the model was not explicitly exposed to during training.
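
One way to check the cross-lingual use case is to break accuracy down by language instead of reporting a single pooled number. The sketch below assumes each example is a dict with `language` and `label` keys (hypothetical field names) and uses made-up toy data; the same grouping idea applies to held-out questions for the cross-target case.

```python
from collections import defaultdict


def accuracy_by_language(examples, predictions):
    """Accuracy per language.

    examples: dicts with 'language' and 'label' keys.
    predictions: predicted labels, in the same order as examples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for example, predicted in zip(examples, predictions):
        language = example["language"]
        total[language] += 1
        correct[language] += int(predicted == example["label"])
    return {language: correct[language] / total[language] for language in total}


# Made-up toy data, not drawn from the dataset.
toy_examples = [
    {"language": "de", "label": "FAVOR"},
    {"language": "de", "label": "AGAINST"},
    {"language": "fr", "label": "FAVOR"},
    {"language": "it", "label": "AGAINST"},
]
toy_predictions = ["FAVOR", "FAVOR", "FAVOR", "AGAINST"]
per_language = accuracy_by_language(toy_examples, toy_predictions)
```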

Limitations

Domain Bias: The training data is derived from Swiss political debates and Smartvote candidate comments. Performance may vary when applied to informal social media comments, general-purpose texts, or political contexts outside of Switzerland.
Binary Assumption: Stance is framed as binary classification (FAVOR or AGAINST), so the model cannot express neutral, mixed, or conditional positions.
Implicit Stances: The model can struggle with implicit stances, where the position is not stated directly but must be inferred from context or pragmatic reasoning. For applications that require deeper reasoning over implicit stances, framing the task as a Natural Language Inference (NLI) problem (e.g., classifying whether a text entails or contradicts a hypothesis that states the target position) might be a more suitable approach.
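
The NLI framing mentioned in the limitations could look roughly like the sketch below. It only prepares inputs and maps outputs; it does not run an NLI model. The hypothesis template, the helper names, and the extra NONE outcome are all hypothetical choices, not part of this model.

```python
# Hypothetical mapping from NLI labels to stance labels.
# NONE is an extra outcome that the binary model in this card cannot produce.
NLI_TO_STANCE = {
    "entailment": "FAVOR",
    "contradiction": "AGAINST",
    "neutral": "NONE",
}


def to_nli_pair(question: str, comment: str) -> dict:
    """Use the comment as premise and a statement of support as hypothesis."""
    hypothesis = f"The author is in favor of the following proposal: {question}"
    return {"premise": comment, "hypothesis": hypothesis}


def nli_to_stance(nli_label: str) -> str:
    """Translate an NLI model's label into a stance label."""
    return NLI_TO_STANCE[nli_label.lower()]
```

A separate NLI-trained checkpoint would be needed to score these pairs; the wording of the hypothesis is likely to affect results and would need tuning.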

Training and evaluation data

The model was trained and evaluated using the ZurichNLP/x_stance dataset:

Dataset Composition: Contains over 150 political questions and 67,000 comments written by political candidates in Switzerland.
Languages: The training and validation sets comprise approximately 75% German data and 25% French data. The test split contains Italian samples as well, facilitating the evaluation of zero-shot cross-lingual transfer.
Input Format: The typical input consists of the target political question paired with the candidate's comment.
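
A sketch of how the question/comment pairs might be loaded and tokenized. The column names (`question`, `comment`, `label`), the string-valued labels, and the `LABEL2ID` mapping are assumptions about the dataset layout and may not match the preprocessing actually used for this model; loading may also behave differently across Datasets versions.

```python
LABEL2ID = {"AGAINST": 0, "FAVOR": 1}  # hypothetical id assignment


def to_features(example: dict) -> dict:
    """Turn one dataset row into a (question, comment, label id) record."""
    return {
        "text": example["question"],
        "text_pair": example["comment"],
        "labels": LABEL2ID[example["label"]],
    }


def load_and_tokenize(model_id: str = "FacebookAI/xlm-roberta-base"):
    """Load the dataset and tokenize each question/comment pair (needs network access)."""
    # Imported lazily so to_features works without these libraries installed.
    from datasets import load_dataset
    from transformers import AutoTokenizer

    dataset = load_dataset("ZurichNLP/x_stance")
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    def encode(batch):
        # Question as the first segment, comment as the second (assumed order).
        encoded = tokenizer(batch["question"], batch["comment"], truncation=True)
        encoded["labels"] = [LABEL2ID[name] for name in batch["label"]]
        return encoded

    return dataset.map(encode, batched=True)
```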

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 850
num_epochs: 3
mixed_precision_training: Native AMP
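
The hyperparameters above map onto `TrainingArguments` fields roughly as sketched below. This is a reconstruction under assumptions, not the original training script: it assumes a single device (so the batch sizes are per-device), reads "Native AMP" as `fp16=True`, and uses a hypothetical output directory. The second function spells out the linear warmup and decay schedule, taking the total of 8559 steps from the final row of the training results.

```python
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes single-device training
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 850,
    "num_train_epochs": 3,
    "fp16": True,  # assumption: one reading of "Native AMP"
}


def build_training_args(output_dir: str = "xlm-roberta-xstance"):
    """Build TrainingArguments from HPARAMS (output_dir is a hypothetical name)."""
    # Imported lazily so the schedule helper below runs without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HPARAMS)


def linear_schedule_lr(step, peak_lr=2e-5, warmup_steps=850, total_steps=8559):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return peak_lr * max(0.0, remaining)
```

Under these assumptions the learning rate climbs to 2e-05 over the first 850 updates, about 10% of training, and then falls linearly to zero at step 8559.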

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Macro F1
0.5537	1.0	2853	0.5749	0.7175	0.7175
0.4712	2.0	5706	0.4957	0.7588	0.7587
0.3804	3.0	8559	0.5225	0.7687	0.7687
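
A few things can be read off the table with simple arithmetic. The sketch below copies the rows above and picks out the best epoch by validation loss and by Macro F1. It also estimates the training set size from the steps per epoch, assuming a batch size of 16 on a single device with no gradient accumulation (neither is stated in this card).

```python
# (epoch, step, training loss, validation loss, accuracy, macro F1), copied from the table above.
RESULTS = [
    (1, 2853, 0.5537, 0.5749, 0.7175, 0.7175),
    (2, 5706, 0.4712, 0.4957, 0.7588, 0.7587),
    (3, 8559, 0.3804, 0.5225, 0.7687, 0.7687),
]

best_by_loss = min(RESULTS, key=lambda row: row[3])  # lowest validation loss
best_by_f1 = max(RESULTS, key=lambda row: row[5])    # highest macro F1

steps_per_epoch = RESULTS[0][1]
batch_size = 16  # assumed effective batch size

# With a partial final batch allowed, N examples give ceil(N / batch_size) steps,
# so the step count bounds N from both sides.
max_examples = steps_per_epoch * batch_size
min_examples = (steps_per_epoch - 1) * batch_size + 1
```

Validation loss is lowest after epoch 2, while accuracy and Macro F1 peak after epoch 3; the reported final metrics match the epoch 3 row. Rising validation loss alongside rising accuracy may indicate that the model is becoming overconfident on its errors, although three data points are too few to say much.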

Framework versions

Transformers 5.12.1
Pytorch 2.8.0a0+5228986c39.nv25.06
Datasets 5.0.0
Tokenizers 0.22.2


Safetensors

Model size: 0.3B params
Tensor type: F32

Model tree for MatteoFasulo/xlm-roberta-xstance

Base model: FacebookAI/xlm-roberta-base
