Introduction

ruBert-base is a pretrained transformer language model for the Russian language. Developed by the SberDevices team, it is an encoder model trained with a masked language modeling objective and distributed as a PyTorch checkpoint, making it well suited for mask-filling tasks.

Architecture

  • Task: Mask filling
  • Type: Encoder
  • Tokenizer: Byte Pair Encoding (BPE)
  • Vocabulary Size: 120,138 tokens
  • Number of Parameters: 178 million
  • Training Data Volume: 30 GB
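The published checkpoint's configuration can be inspected to confirm the specifications above. A minimal sketch, assuming the model is available on the Hugging Face Hub under the repository id `ai-forever/ruBert-base` (the id is not stated in this card; the model was previously distributed under the `sberbank-ai` namespace):

```python
# Sketch: load only the lightweight config and tokenizer (no model weights)
# to verify the architecture specs. Repo id "ai-forever/ruBert-base" is an
# assumption — adjust it to the checkpoint you actually use.
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("ai-forever/ruBert-base")
print(config.model_type)   # "bert" — a BERT-style encoder
print(config.vocab_size)   # vocabulary size, per the table above

tokenizer = AutoTokenizer.from_pretrained("ai-forever/ruBert-base")
print(tokenizer.mask_token)  # the token the fill-mask task expects
```

Checking the config first is a cheap way to catch a wrong repository id before downloading the full ~178M-parameter weights.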

Training

The model was pretrained by the NLP core team at SberDevices. The training process and evaluation are detailed in the associated preprint.

Guide: Running Locally

  1. Install Dependencies:

    • Install Python and PyTorch.
    • Install Hugging Face Transformers library.
  2. Download the Model:

    • Access the model files from the Hugging Face Model Hub.
  3. Run the Model:

    • Use the fill-mask pipeline from Transformers to run the model on masked inputs.
  4. Optional: Utilize cloud GPUs from providers like AWS, Google Cloud, or Azure for improved performance.
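The steps above can be sketched as a short script. This is a minimal example, assuming `torch` and `transformers` are installed (step 1, e.g. `pip install torch transformers`) and that the checkpoint id is `ai-forever/ruBert-base` (an assumption — the guide does not name the repository):

```python
# Sketch of steps 2-3: the pipeline downloads the model from the Hub on
# first use, then fills the masked position with ranked candidate tokens.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="ai-forever/ruBert-base")

# Use the tokenizer's own mask token rather than hard-coding "[MASK]".
text = f"Столица России — {fill_mask.tokenizer.mask_token}."

for prediction in fill_mask(text):
    # Each prediction carries the proposed token and its probability score.
    print(prediction["token_str"], round(prediction["score"], 3))
```

On a machine without a GPU this runs on CPU; passing `device=0` to `pipeline` (or using a cloud GPU, as in step 4) speeds up inference.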

License

This model is released under the Apache 2.0 license.
