robertuito emotion analysis
pysentimiento
Introduction
ROBERTUITO-EMOTION-ANALYSIS is a model for emotion detection in Spanish, based on RoBERTuito, a RoBERTa model trained on Spanish tweets. It is trained on the TASS 2020 Task 2 corpus and identifies the six Ekman emotions (anger, disgust, fear, joy, sadness, and surprise) plus a neutral class. It is intended for emotion analysis of Spanish social media text, particularly Twitter.
Architecture
The model builds on RoBERTuito, a pre-trained language model tailored to user-generated text in Spanish. RoBERTuito belongs to the Transformer family and uses the RoBERTa architecture, which is well suited to natural language processing tasks. It was pre-trained on over 500 million tweets, which helps it handle the informal language typical of Spanish social media.
Training
The ROBERTUITO-EMOTION-ANALYSIS model was trained on the TASS 2020 Task 2 corpus, which focuses on emotion detection in Spanish tweets. Performance is reported as Macro F1 scores on four tasks: emotion detection, hate speech, irony, and sentiment analysis. On these tasks, RoBERTuito performs competitively against other models such as roberta, bertin, and mbert_uncased.
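For readers unfamiliar with the metric, Macro F1 is the unweighted mean of the per-class F1 scores, so rare emotions count as much as frequent ones. The following is a minimal pure-Python sketch of that definition, not the evaluation code used by the authors; the function name `macro_f1` and the toy labels are illustrative assumptions.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with made-up labels (not real model output)
gold = ["joy", "joy", "anger", "neutral"]
pred = ["joy", "anger", "anger", "neutral"]
score = macro_f1(gold, pred)
```

In this toy case the per-class F1 values are 2/3 for joy, 2/3 for anger, and 1 for neutral, so the macro average is 7/9.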
Guide: Running Locally
To run the ROBERTUITO-EMOTION-ANALYSIS model locally:
- Installation: Ensure you have pytorch and pysentimiento installed in your environment. You can use pip for installation:
  pip install torch pysentimiento
- Clone the Repository: Download the model from the Hugging Face model hub or clone the repository:
  git clone https://github.com/pysentimiento/pysentimiento/
- Load the Model: Use the pysentimiento library to load and interact with the model.
- Inference: Prepare your text data and use the model to predict emotions.
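The load and inference steps above can be sketched as follows. This assumes the `create_analyzer` entry point of the pysentimiento library with `task="emotion"` and `lang="es"`, and that the prediction object exposes `output` and `probas`. The helper names `rank_emotions`, `load_emotion_analyzer`, and `demo`, and the sample sentence, are illustrative and not part of the library.

```python
def rank_emotions(probas):
    """Sort a {label: probability} mapping from most to least likely."""
    return sorted(probas.items(), key=lambda item: item[1], reverse=True)


def load_emotion_analyzer():
    """Build the Spanish emotion analyzer (assumed pysentimiento API)."""
    # Imported lazily so rank_emotions stays usable without the library installed.
    from pysentimiento import create_analyzer

    return create_analyzer(task="emotion", lang="es")


def demo(text="Qué alegría verte de nuevo"):
    """Run one prediction and print the ranked class probabilities."""
    analyzer = load_emotion_analyzer()  # downloads the weights on first use
    result = analyzer.predict(text)
    print("Predicted label:", result.output)
    for label, prob in rank_emotions(result.probas):
        print(f"{label}: {prob:.3f}")
```

Calling `demo()` requires network access the first time, since the model weights are fetched from the Hugging Face model hub. Building the analyzer once and reusing it for many texts avoids reloading the model for every prediction.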
For enhanced performance, especially with larger datasets, consider using cloud GPU services like AWS EC2, Google Cloud Platform, or Azure.
License
The ROBERTUITO-EMOTION-ANALYSIS model and its associated resources are available under open licenses that facilitate academic and research use. Please review the specific licensing terms in the repository to ensure compliance with any conditions or restrictions.