rubert base cased sentiment rusentiment

blanchefort

Introduction

The rubert-base-cased-sentiment-rusentiment model is a fine-tuned version of DeepPavlov's rubert-base-cased-conversational model. It is designed for sentiment analysis on Russian text using the RuSentiment dataset. The model classifies text into three sentiment categories: Neutral, Positive, and Negative.

Architecture

The model is based on the BERT (Bidirectional Encoder Representations from Transformers) architecture and is loaded through the AutoModelForSequenceClassification class from the Hugging Face Transformers library. Text inputs are tokenized with BertTokenizerFast, the fast BERT tokenizer from the same library.

Training

This model was trained on the RuSentiment dataset, an enriched sentiment analysis dataset for Russian social media. The dataset was presented in the Proceedings of COLING 2018 and serves as a resource for evaluating sentiment in Russian text.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Ensure you have Python installed, along with the transformers and torch libraries.

    pip install transformers torch
    
  2. Load the Model and Tokenizer:

    import torch
    from transformers import AutoModelForSequenceClassification, BertTokenizerFast
    
    tokenizer = BertTokenizerFast.from_pretrained('blanchefort/rubert-base-cased-sentiment-rusentiment')
    model = AutoModelForSequenceClassification.from_pretrained('blanchefort/rubert-base-cased-sentiment-rusentiment', return_dict=True)
    
  3. Define the Prediction Function:

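    @torch.no_grad()
    def predict(text):
        # Tokenize one string or a list of strings; inputs longer than 512 tokens are truncated.
        inputs = tokenizer(text, max_length=512, padding=True, truncation=True, return_tensors='pt')
        outputs = model(**inputs)
        # Convert logits to class probabilities, then take the most likely class per input.
        predicted = torch.nn.functional.softmax(outputs.logits, dim=1)
        predicted = torch.argmax(predicted, dim=1).numpy()
        return predicted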
    
  4. Predict Sentiments: Call the predict function with a string or a list of strings. It returns a NumPy array of predicted class indices, one per input.
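
The predict function returns integer class indices rather than label names. The sketch below shows one way to turn them into readable labels. The helper name to_labels and the index order (0 = Neutral, 1 = Positive, 2 = Negative) are assumptions based on the order in which the categories are listed in the Introduction; check model.config.id2label on the loaded model before relying on the mapping.

```python
# Assumed index-to-label order; confirm against model.config.id2label.
ASSUMED_LABELS = {0: "NEUTRAL", 1: "POSITIVE", 2: "NEGATIVE"}


def to_labels(class_ids, id2label=ASSUMED_LABELS):
    """Map an iterable of integer class ids to label strings."""
    return [id2label[int(i)] for i in class_ids]


# With the predict function from step 3 in scope, usage might look like:
#   labels = to_labels(predict(["Мне очень понравился этот фильм!"]))
#   print(labels)
```

Passing model.config.id2label as the second argument avoids hard-coding the order.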

To speed up inference, especially on large batches of text, consider running the model on a GPU, for example through cloud services such as Google Colab, AWS, or Azure.
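
Running on a GPU requires moving both the model and the tokenized inputs to the same device. The following is a minimal sketch of that pattern, not part of the original guide: pick_device and predict_on are hypothetical helper names, and the model and tokenizer arguments are assumed to be the objects loaded in step 2.

```python
import torch


def pick_device():
    """Return a CUDA device when PyTorch can see one, otherwise the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


@torch.no_grad()
def predict_on(model, tokenizer, texts, device):
    """Variant of predict that runs the forward pass on the given device."""
    model.to(device).eval()
    inputs = tokenizer(texts, max_length=512, padding=True,
                       truncation=True, return_tensors="pt")
    # Every input tensor must live on the same device as the model.
    inputs = {name: tensor.to(device) for name, tensor in inputs.items()}
    logits = model(**inputs).logits
    # Move the result back to the CPU before converting to NumPy.
    return logits.argmax(dim=1).cpu().numpy()


# Usage, assuming model and tokenizer from step 2:
#   predicted = predict_on(model, tokenizer, ["Пример текста"], pick_device())
```

The softmax from step 3 is omitted here because taking the argmax of the logits selects the same class.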

License

The use and distribution of the model are subject to the terms outlined by Hugging Face. Always ensure compliance with these terms when deploying the model in applications.
