RuBERT-Base-Cased-Sentence by DeepPavlov
Introduction
The RuBERT-Base-Cased-Sentence model by DeepPavlov is a sentence encoder for the Russian language. It is a cased, 12-layer transformer model, designed for feature extraction and built on top of the RuBERT architecture. This model is fine-tuned on Google-translated SNLI and the Russian portion of the XNLI development set.
Architecture
The model is a 12-layer transformer with 768 hidden units and 12 attention heads, comprising 180 million parameters. It initializes from RuBERT and applies mean pooling to token embeddings for sentence representation, similar to Sentence-BERT.
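As a concrete illustration, mask-aware mean pooling can be sketched as follows. This is a minimal example of the general technique, not DeepPavlov's exact implementation; the function name `mean_pool` is our own.

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) tokens only."""
    # token_embeddings: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
    mask = attention_mask.unsqueeze(-1).float()      # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)    # zero out padding, then sum over tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)         # number of real tokens per sentence
    return summed / counts                           # (batch, hidden) sentence embeddings
```

Averaging only over non-padding positions matters: padded batches would otherwise dilute the embeddings of shorter sentences.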
Training
The RuBERT-Base-Cased-Sentence model is initialized from RuBERT and fine-tuned on a combination of natural language inference datasets: the SNLI corpus machine-translated into Russian and the Russian portion of the XNLI development set. This NLI-based fine-tuning improves the semantic quality of the sentence embeddings the model produces for Russian text.
Guide: Running Locally
- Clone the Repository: Fetch the model repository from the Hugging Face Hub.
- Install Dependencies: Ensure you have Python and PyTorch installed. Install additional dependencies using pip.
- Load the Model: Use the Hugging Face Transformers library to load the model and tokenizer.
- Inference: Tokenize input sentences, run them through the model, and pool the token embeddings to extract sentence features, as shown in the sketch below.
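Putting the steps together, a minimal end-to-end sketch might look like the following, assuming the Hub ID DeepPavlov/rubert-base-cased-sentence and a plain PyTorch environment:

```python
# pip install torch transformers
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "DeepPavlov/rubert-base-cased-sentence"  # assumed Hugging Face Hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

sentences = ["Кошка сидит на окне.", "Собака бежит по парку."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the last hidden states, ignoring padding tokens.
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
print(embeddings.shape)  # e.g. torch.Size([2, 768])
```

The resulting vectors can then be compared with cosine similarity (e.g. torch.nn.functional.cosine_similarity) for semantic search or clustering.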
Suggested Cloud GPUs
For efficient inference, particularly on large batches of sentences, consider cloud GPUs such as those offered by AWS (Amazon EC2), Google Cloud Platform, or Microsoft Azure.
License
Information about the licensing for the RuBERT-Base-Cased-Sentence model is available in the model's repository on Hugging Face. Ensure compliance with the terms provided.