rubert base cased sentiment rurewiews
blanchefort
Introduction
The RuBERT-Base-Cased-Sentiment-RuReviews model performs sentiment analysis of product reviews in Russian. It is based on the DeepPavlov/rubert-base-cased-conversational model and trained on the RuReviews dataset to classify each review as neutral, positive, or negative.
Architecture
The model is based on the BERT architecture and is loaded through the transformers library. It supports PyTorch, TensorFlow, and JAX. It is tagged for sentiment analysis and text classification and targets Russian-language text.
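To illustrate the classification step, here is a minimal sketch in plain Python of how a three-class classifier's raw scores (logits) become a predicted class. A BERT sequence classifier typically emits one logit per class; the logit values below are made up for illustration, and the helper names `softmax` and `argmax` are hypothetical, not part of the model's API.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up logits for a single review: one score per sentiment class.
logits = [0.3, 2.1, -1.2]
probs = softmax(logits)

# The predicted class is the index with the highest probability.
predicted_class = argmax(probs)
print(predicted_class)  # 1
```

Because softmax preserves the ordering of the scores, taking the argmax of the logits directly gives the same class. The softmax step matters when the probabilities themselves are needed, for example to apply a confidence threshold.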
Training
The model was trained on the RuReviews dataset, which contains automatically annotated sentiment labels for Russian-language product reviews. Training uses the BERT encoder's text representations as the input to sentiment classification.
Guide: Running Locally
To use this model for sentiment analysis, follow these steps:
- Install Dependencies: Ensure you have PyTorch and the transformers library installed in your Python environment.
- Load the Model and Tokenizer:

    import torch
    from transformers import AutoModelForSequenceClassification, BertTokenizerFast

    # Download (or load from the local cache) the tokenizer and the classifier
    tokenizer = BertTokenizerFast.from_pretrained('blanchefort/rubert-base-cased-sentiment-rurewiews')
    model = AutoModelForSequenceClassification.from_pretrained('blanchefort/rubert-base-cased-sentiment-rurewiews', return_dict=True)

- Define Prediction Function:

    @torch.no_grad()  # inference only, so skip gradient tracking
    def predict(text):
        # Tokenize the input, truncating anything longer than 512 tokens
        inputs = tokenizer(text, max_length=512, padding=True, truncation=True, return_tensors='pt')
        outputs = model(**inputs)
        # Turn logits into class probabilities, then pick the most likely class
        predicted = torch.nn.functional.softmax(outputs.logits, dim=1)
        predicted = torch.argmax(predicted, dim=1).numpy()
        return predicted

- Run Predictions: Use the predict function to classify the sentiment of input text.
- Hardware Recommendations: If running locally becomes resource-intensive, consider using cloud GPU services such as AWS, Google Cloud, or Azure to handle the computations efficiently.
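The prediction function in this guide returns numeric class indices rather than label names. The sketch below shows one way to translate them. The helper `to_labels` and the mapping `example_id2label` are hypothetical names introduced here for illustration. The index order shown is an assumption, so read the real mapping from `model.config.id2label` on the loaded model instead of hard-coding it.

```python
from typing import Iterable, List, Mapping


def to_labels(indices: Iterable[int], id2label: Mapping[int, str]) -> List[str]:
    """Translate predicted class indices into human-readable label names."""
    # int() also accepts the NumPy integers returned by predict()
    return [id2label[int(i)] for i in indices]


# ASSUMED mapping, for illustration only. Verify it against
# model.config.id2label before relying on it.
example_id2label = {0: "NEUTRAL", 1: "POSITIVE", 2: "NEGATIVE"}

labels = to_labels([1, 2, 0], example_id2label)
print(labels)  # ['POSITIVE', 'NEGATIVE', 'NEUTRAL']

# With the tokenizer, model, and predict() from the steps above loaded,
# a call would look like this (not run here, since it downloads the model):
#   texts = ["Отличный товар!", "Ужасное качество."]
#   print(to_labels(predict(texts), model.config.id2label))
```

Passing a list of strings to `predict` classifies them as one batch, since the tokenizer pads them to a common length.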
License
The model and its accompanying resources are subject to the licensing terms provided by Hugging Face and the respective dataset creators. Users should refer to the original model and dataset pages for detailed licensing information.