rubert-base-cased-sentence

DeepPavlov

Introduction

The RuBERT-Base-Cased-Sentence model by DeepPavlov is a sentence encoder for the Russian language. It is a cased, 12-layer transformer designed for feature extraction and built on top of the RuBERT architecture. The model is fine-tuned on the SNLI corpus machine-translated into Russian and on the Russian portion of the XNLI development set.

Architecture

The model is a 12-layer transformer with 768 hidden units and 12 attention heads, comprising roughly 180 million parameters. It is initialized with RuBERT weights, and sentence representations are obtained by mean-pooling the token embeddings, in the same manner as Sentence-BERT.
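
In code, this pooling step amounts to averaging the final-layer token embeddings while masking out padding. A minimal sketch in PyTorch (the `mean_pool` helper name is illustrative, not part of the model's API):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average the final-layer token embeddings, ignoring padding positions.

    token_embeddings: (batch, seq_len, hidden) -- the model's last hidden state
    attention_mask:   (batch, seq_len) -- 1 for real tokens, 0 for padding
    """
    # Broadcast the mask over the hidden dimension so padded tokens contribute zero.
    mask = attention_mask.unsqueeze(-1).type_as(token_embeddings)
    summed = (token_embeddings * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # guard against all-padding rows
    return summed / counts
```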

Training

The RuBERT-Base-Cased-Sentence model starts from RuBERT and is fine-tuned on a combination of natural language inference datasets: the SNLI corpus machine-translated into Russian and the Russian portion of the XNLI development set. This fine-tuning enhances its ability to handle natural language inference in Russian.

Guide: Running Locally

  1. Clone the Repository: Clone the model repository from Hugging Face.
  2. Install Dependencies: Ensure Python and PyTorch are installed, then install the Hugging Face transformers library with pip.
  3. Load the Model: Use the Hugging Face Transformers library to load the model and tokenizer.
  4. Inference: Pass input sentences through the model and mean-pool the token embeddings to extract sentence features (see the sketch after this list).
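
A minimal end-to-end sketch, assuming the standard Hugging Face Transformers API and the repository ID `DeepPavlov/rubert-base-cased-sentence` (install dependencies with `pip install torch transformers` first):

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "DeepPavlov/rubert-base-cased-sentence"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

sentences = ["Привет, мир!", "Как дела?"]  # any Russian sentences to encode
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the last hidden state over real (non-padding) tokens.
mask = inputs["attention_mask"].unsqueeze(-1).type_as(outputs.last_hidden_state)
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

print(embeddings.shape)  # torch.Size([2, 768])
```

The resulting 768-dimensional vectors can then be compared with cosine similarity for tasks such as semantic search or clustering.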

Suggested Cloud GPUs

For efficient inference, especially on large batches of sentences, consider cloud GPUs such as those offered by AWS (Amazon EC2), Google Cloud Platform, or Microsoft Azure.

License

Licensing information for the RuBERT-Base-Cased-Sentence model is available in the model's repository on Hugging Face; ensure compliance with the terms provided there.
