ruRoberta-large

ai-forever

Introduction

ruRoberta-large is a pretrained Transformer-based language model for the Russian language. It is trained for the fill-mask task and uses an encoder-only architecture with a byte-level BPE (BBPE) tokenizer. The model was developed by the SberDevices team; detailed information is available in the preprint arXiv:2309.10931.

Architecture

  • Task: Mask filling
  • Type: Encoder
  • Tokenizer: BBPE
  • Dictionary Size: 50,257
  • Number of Parameters: 355 million
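
As a minimal sketch, the figures above can be checked against the published checkpoint (assuming the Hub model id ai-forever/ruRoberta-large, which matches this card's author):

```python
# Minimal sketch: verify vocabulary size and parameter count of the checkpoint.
# Assumes the Hub model id "ai-forever/ruRoberta-large" and an installed
# transformers + torch environment.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai-forever/ruRoberta-large")
model = AutoModelForMaskedLM.from_pretrained("ai-forever/ruRoberta-large")

print(len(tokenizer))                                    # expected: ~50,257 tokens
print(sum(p.numel() for p in model.parameters()) / 1e6)  # expected: ~355M parameters
```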

Training

ruRoberta-large was pretrained on 250 GB of Russian text by the SberDevices team, using their infrastructure for training and evaluation. The pretraining setup is documented in the team's preprint (arXiv:2309.10931).

Guide: Running Locally

  1. Environment Setup: Ensure you have Python and PyTorch installed. Create a virtual environment for the project.
  2. Install Transformers: Use pip install transformers to install the Hugging Face Transformers library.
  3. Load the Model: Use the Transformers library to load ruRoberta-large from the Hugging Face Hub.
  4. Run Inference: Use the fill-mask pipeline to perform predictions with the model (a minimal sketch follows this list).
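
A minimal sketch of steps 3 and 4, assuming the Hub model id ai-forever/ruRoberta-large:

```python
# Minimal sketch: load ruRoberta-large from the Hugging Face Hub and run the
# fill-mask pipeline. Assumes the model id "ai-forever/ruRoberta-large" and a
# working transformers + torch installation (pip install transformers torch).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="ai-forever/ruRoberta-large")

# RoBERTa-style models mark the position to predict with the <mask> token.
predictions = fill_mask("Привет! Меня зовут <mask>.")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```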

Cloud GPUs: For efficient processing, consider using cloud-based GPUs such as those offered by AWS, Google Cloud, or Azure to handle the computational demands of model inference.
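
As a minimal sketch of running on such an instance, the pipeline can be placed on a GPU when one is available and fall back to CPU otherwise (again assuming the model id ai-forever/ruRoberta-large):

```python
# Minimal sketch: use a GPU when available (e.g. on a cloud GPU instance),
# otherwise run on CPU. The pipeline API takes a device index (-1 = CPU).
import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1
fill_mask = pipeline("fill-mask", model="ai-forever/ruRoberta-large", device=device)
print(fill_mask("Столица России — <mask>.")[0]["token_str"])
```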

License

For licensing information, refer to the model card on the Hugging Face Hub or the accompanying publication.
