robertuito sentiment analysis
pysentimiento
Introduction
The RoBERTuito-Sentiment-Analysis model is designed for sentiment analysis in Spanish, particularly for text from Twitter. It utilizes the RoBERTuito base model, which is a variant of RoBERTa trained specifically on Spanish tweets. The model assigns sentiment labels of positive (POS), negative (NEG), and neutral (NEU).
Architecture
RoBERTuito-Sentiment-Analysis takes the RoBERTuito base model, a RoBERTa architecture, and fine-tunes it on the TASS 2020 corpus, which contains approximately 5,000 tweets across several Spanish dialects. The resulting transformer classifier assigns each input text one of the three labels: POS, NEG, or NEU.
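To make the classification step concrete, here is a minimal sketch of how a RoBERTa-style classifier's raw scores become one of the three labels. It assumes the checkpoint is published on the Hugging Face Hub as `pysentimiento/robertuito-sentiment-analysis` and that it loads with the standard `transformers` auto classes; neither detail is stated here, and the helper names are illustrative only.

```python
import math

# Assumed Hub id (not stated in this document); adjust if the checkpoint lives elsewhere.
MODEL_ID = "pysentimiento/robertuito-sentiment-analysis"


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(text):
    """Run the fine-tuned model on one text. Downloads weights on first use."""
    # Imported lazily so the helpers above work without transformers/torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    # The label names come from the checkpoint's own config, not from this sketch.
    return pick_label(logits, model.config.id2label)
```

Calling the model directly like this skips any tweet-specific preprocessing that the pysentimiento analyzer may apply before tokenizing, so its predictions can differ from the analyzer's.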
Training
The model was fine-tuned for sentiment analysis on the TASS 2020 corpus, starting from RoBERTuito, which was pre-trained on over 500 million Spanish tweets. The same base model was also fine-tuned for related tasks such as emotion detection, hate speech detection, and irony detection. Results on all of these tasks are reported as Macro F1 scores, the unweighted mean of the per-class F1 scores.
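Macro F1 weights every class equally regardless of how often it appears, which matters when one sentiment dominates the data. The following is a small illustrative sketch of the metric in plain Python; the gold and predicted labels are made-up toy values, not results from the model.

```python
def macro_f1(y_true, y_pred, labels=("POS", "NEG", "NEU")):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treated as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data for illustration only.
gold = ["POS", "NEG", "NEU", "POS", "NEG", "NEU"]
pred = ["POS", "NEG", "POS", "POS", "NEU", "NEU"]
print(round(macro_f1(gold, pred), 3))  # 0.656
```

Here the per-class scores are 0.8 (POS), about 0.667 (NEG), and 0.5 (NEU), and their plain average gives the printed value.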
Guide: Running Locally
To run the RoBERTuito-Sentiment-Analysis model locally, follow these steps:
- Install pysentimiento: Ensure you have Python installed, then run:
  pip install pysentimiento
- Create an Analyzer:
  from pysentimiento import create_analyzer
  analyzer = create_analyzer(task="sentiment", lang="es")
- Predict Sentiment: Use the analyzer to predict sentiment:
  analyzer.predict("Qué gran jugador es Messi")
- Cloud GPUs: For enhanced performance, consider using cloud GPU services such as Google Colab, AWS, or Azure.
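The steps above can be combined into a short script. This is a sketch rather than the toolkit's documented workflow: `build_analyzer` and `label_texts` are hypothetical helper names, and the code assumes that the object returned by `predict` exposes an `output` attribute (the label) and a `probas` mapping from label to probability.

```python
def build_analyzer():
    """Create the Spanish sentiment analyzer (downloads the model on first use)."""
    # Imported lazily so label_texts can be used without the package installed.
    from pysentimiento import create_analyzer

    return create_analyzer(task="sentiment", lang="es")


def label_texts(analyzer, texts):
    """Return a (text, label, confidence) tuple for each input text."""
    rows = []
    for text in texts:
        result = analyzer.predict(text)
        # Assumed result shape: .output is the label, .probas maps label -> probability.
        rows.append((text, result.output, result.probas[result.output]))
    return rows
```

For example, `label_texts(build_analyzer(), ["Qué gran jugador es Messi"])` would return one tuple holding the text, its predicted label (POS, NEG, or NEU), and the probability assigned to that label. Passing the analyzer in as an argument means the model is loaded once and reused for every text.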
License
The RoBERTuito-Sentiment-Analysis model and the pysentimiento toolkit are available under open licenses, encouraging research and development in sentiment analysis and other NLP tasks. Be sure to cite the relevant papers when using the model in research.