rubert-base-cased-sentiment-rusentiment
blanchefort
Introduction
The rubert-base-cased-sentiment-rusentiment model is a fine-tuned version of DeepPavlov's rubert-base-cased-conversational model. It is designed for sentiment analysis on Russian text using the RuSentiment dataset. The model classifies text into three sentiment categories: Neutral, Positive, and Negative.
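The model's predictions come back as integer class indices, so a small lookup table helps when reading results. The sketch below assumes the indices follow the order in which the categories are listed above (0 for Neutral, 1 for Positive, 2 for Negative); that ordering and the names `ID2LABEL` and `to_labels` are illustrative assumptions, and the authoritative mapping should be checked against `model.config.id2label` once the model is loaded.

```python
# Assumed index-to-label mapping; confirm it against model.config.id2label.
ID2LABEL = {0: "NEUTRAL", 1: "POSITIVE", 2: "NEGATIVE"}


def to_labels(class_ids):
    """Convert an iterable of predicted class indices to label strings."""
    return [ID2LABEL[int(class_id)] for class_id in class_ids]


if __name__ == "__main__":
    print(to_labels([1, 0, 2]))
```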
Architecture
The model is based on the BERT (Bidirectional Encoder Representations from Transformers) architecture and is loaded with the AutoModelForSequenceClassification class from the Hugging Face Transformers library. Text inputs are processed with BERT's fast tokenizer, BertTokenizerFast.
Training
This model was trained on RuSentiment, an enriched sentiment analysis dataset for Russian social media. The dataset was presented in the Proceedings of COLING 2018 and serves as a resource for evaluating sentiment in Russian text.
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies: Ensure you have Python installed, along with the transformers and torch libraries.

    pip install transformers torch
- Load the Model and Tokenizer:

    import torch
    from transformers import AutoModelForSequenceClassification, BertTokenizerFast

    # Download (or load from the local cache) the tokenizer and the fine-tuned classifier
    tokenizer = BertTokenizerFast.from_pretrained('blanchefort/rubert-base-cased-sentiment-rusentiment')
    model = AutoModelForSequenceClassification.from_pretrained('blanchefort/rubert-base-cased-sentiment-rusentiment', return_dict=True)
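- Define the Prediction Function:

    @torch.no_grad()  # inference only, so gradient tracking is disabled
    def predict(text):
        # Tokenize, truncating to BERT's 512-token limit and padding to a common length
        inputs = tokenizer(text, max_length=512, padding=True, truncation=True, return_tensors='pt')
        outputs = model(**inputs)
        # Convert logits to class probabilities, then take the most likely class per input
        predicted = torch.nn.functional.softmax(outputs.logits, dim=1)
        predicted = torch.argmax(predicted, dim=1).numpy()
        return predicted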
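- Predict Sentiments: Use the predict function to analyze sentiment in a given text. It accepts a string or a list of strings and returns a NumPy array with one predicted class index per input.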
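The prediction function above applies softmax and then argmax to the logits. Because softmax preserves the ordering of its inputs, the argmax is the same either way; the softmax step mainly matters when a confidence score is wanted alongside the class. The following is a minimal pure-Python sketch of that computation on a single row of logits. The helper names `softmax` and `top_class` and the example logits are hypothetical and not part of the model's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_class(logits):
    """Return (index, probability) of the most likely class."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return index, probs[index]


if __name__ == "__main__":
    example_logits = [0.1, 2.0, -1.0]  # made-up values for illustration
    index, confidence = top_class(example_logits)
    print(index, round(confidence, 3))
```

With the real model, the same idea can be applied to `outputs.logits` by keeping the softmax result instead of discarding it after the argmax.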
To speed up inference, consider running the model on a GPU, for example through cloud services such as Google Colab, AWS, or Azure.
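When a GPU is available, both the model and its inputs have to be moved to it. The sketch below shows one way to choose a device; the helper name `pick_device` is an assumption for illustration, and it falls back to the CPU when PyTorch is not installed or no CUDA device is visible.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch not installed in this environment
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    print(device)
    # With the model and tokenizer from the guide loaded, one would then do:
    #   model.to(device)
    #   inputs = {name: tensor.to(device) for name, tensor in inputs.items()}
```

Note that a tensor living on the GPU cannot be converted with `.numpy()` directly; inside `predict`, the result would need `.cpu()` before `.numpy()`.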
License
The use and distribution of the model are subject to the terms outlined by Hugging Face. Always ensure compliance with these terms when deploying the model in applications.