twitter-xlm-roberta-base-sentiment-finetunned
Introduction
The citizenlab/twitter-xlm-roberta-base-sentiment-finetunned model is a multilingual sequence classifier for sentiment analysis based on the XLM-RoBERTa architecture. It was fine-tuned for detecting sentiment in text, building on the Cardiff NLP Group's sentiment classification model.
Architecture
The model uses the XLM-RoBERTa architecture, a robust multilingual transformer capable of understanding and processing text in many languages. It supports text classification in English, Dutch, French, Portuguese, Italian, Spanish, German, Danish, Polish, and Afrikaans.
Training
The model is fine-tuned on the jigsaw_toxicity_pred dataset. It is optimized for sentiment classification tasks and evaluated using metrics such as F1 score and accuracy. The evaluation results show varying precision, recall, and F1-scores across the negative, neutral, and positive sentiment classes.
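As a sketch of how the metrics named above are computed, the snippet below derives accuracy and macro-averaged F1 over the three sentiment classes. The gold labels and predictions are invented for illustration; they are not the model's actual evaluation data.

```python
# Hypothetical gold labels and predictions over the model's three classes.
labels = ["Negative", "Neutral", "Positive"]
y_true = ["Positive", "Negative", "Neutral", "Positive", "Negative"]
y_pred = ["Positive", "Negative", "Positive", "Positive", "Negative"]

def per_class_f1(label):
    """F1 for one class: harmonic mean of its precision and recall."""
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    pred_pos = sum(p == label for p in y_pred)      # times the class was predicted
    actual_pos = sum(t == label for t in y_true)    # times the class truly occurs
    precision = tp / pred_pos if pred_pos else 0.0
    recall = tp / actual_pos if actual_pos else 0.0
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

# Accuracy: fraction of predictions that match the gold label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(per_class_f1(label) for label in labels) / len(labels)
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f}")  # → accuracy=0.80 macro_f1=0.60
```

Macro averaging treats the three classes equally, which is why per-class precision, recall, and F1 can vary even when overall accuracy looks good.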
Guide: Running Locally
- Install the Transformers Library

  Ensure you have the `transformers` library installed. You can install it via pip:

  ```shell
  pip install transformers
  ```
- Load the Model

  Use the following code to run the sentiment analysis:

  ```python
  from transformers import pipeline

  model_path = "citizenlab/twitter-xlm-roberta-base-sentiment-finetunned"
  sentiment_classifier = pipeline("text-classification", model=model_path, tokenizer=model_path)

  result = sentiment_classifier("this is a lovely message")
  print(result)
  ```
- Hardware Requirements

  For efficient processing, especially on large datasets or for real-time analysis, cloud GPUs are recommended. Services such as AWS, Google Cloud, and Azure offer scalable GPU resources.
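As a sketch of the point above, the pipeline can be placed on a GPU via the `device` argument of `transformers.pipeline` (0 selects the first CUDA device, -1 falls back to CPU). The Dutch example sentence is an invented illustration of the model's multilingual support.

```python
# Hedged sketch: run the sentiment pipeline on a GPU when one is available.
import torch
from transformers import pipeline

model_path = "citizenlab/twitter-xlm-roberta-base-sentiment-finetunned"
device = 0 if torch.cuda.is_available() else -1  # 0 = first CUDA GPU, -1 = CPU

sentiment_classifier = pipeline(
    "text-classification", model=model_path, tokenizer=model_path, device=device
)

# Pipelines accept a list of texts, which batches the work on the device.
results = sentiment_classifier([
    "this is a lovely message",
    "dit is een verschrikkelijk bericht",  # Dutch: "this is a terrible message"
])
for r in results:
    print(r["label"], round(r["score"], 3))
```

For sustained throughput, batching inputs as a list (as above) matters as much as the device choice, since it keeps the GPU saturated.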
License
The model's use and distribution are subject to the terms and conditions specified by Hugging Face and any additional licensing provided by the model creators, CitizenLab.