rut5-base-multitask by cointegrated
Introduction
The rut5-base-multitask model is a smaller version of Google's mt5-base whose vocabulary and embeddings have been trimmed to Russian and English tokens. It has been fine-tuned for a variety of text-based tasks including translation, paraphrasing, and dialogue generation.
Architecture
This model is based on the T5 architecture, designed for text-to-text tasks. It is implemented in PyTorch, with weights also usable from JAX and available in the Safetensors format. It supports both Russian and English.
Training
The model has been fine-tuned for several specific tasks:
- Translation (ru-en, en-ru)
- Paraphrasing
- Text gap filling
- Text assembly from unordered words
- Text simplification
- Dialogue response generation
- Open-book question answering
- Question generation about a text
- News headline generation
Each task is specified by prepending the task name to the input text, separated by the | character.
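The prompt convention above can be sketched as a small helper. This is an illustrative snippet, not part of the model's API; only the "translate ru-en" and "translate en-ru" prefixes are confirmed by this document, so treat any other prefix name as an assumption to be checked against the model card.

```python
def make_prompt(task: str, text: str) -> str:
    """Build a rut5-base-multitask prompt: '<task> | <text>'."""
    return f"{task} | {text}"

# 'translate ru-en' is a prefix documented above; other task prefixes
# would follow the same pattern.
print(make_prompt("translate ru-en", "Привет, мир!"))
# translate ru-en | Привет, мир!
```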
Guide: Running Locally
To run the model locally, follow these steps:
- Install dependencies:
  pip install transformers sentencepiece
- Load the model:
  import torch
  from transformers import T5ForConditionalGeneration, T5Tokenizer

  tokenizer = T5Tokenizer.from_pretrained("cointegrated/rut5-base-multitask")
  model = T5ForConditionalGeneration.from_pretrained("cointegrated/rut5-base-multitask")
- Define a generation function:
  def generate(text, **kwargs):
      inputs = tokenizer(text, return_tensors='pt')
      with torch.no_grad():
          hypotheses = model.generate(**inputs, num_beams=5, **kwargs)
      return tokenizer.decode(hypotheses[0], skip_special_tokens=True)
- Run the model on a task:
  print(generate('translate ru-en | Каждый охотник желает знать, где сидит фазан.'))
  # Output: Each hunter wants to know, where he is.
For enhanced performance, consider using cloud GPUs such as those offered by Google Cloud, AWS, or Azure.
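On a GPU machine, the steps above only need a device move. A minimal sketch, assuming the model and tokenizer are loaded as shown earlier (here they are passed in as parameters so the function is self-contained); `generate_on` is a hypothetical name, not part of the library:

```python
import torch

# Pick a GPU when one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

def generate_on(model, tokenizer, text, **kwargs):
    """Variant of generate() that runs inference on the selected device."""
    model = model.to(device)
    # Inputs must live on the same device as the model weights.
    inputs = tokenizer(text, return_tensors="pt").to(device)
    with torch.no_grad():
        hypotheses = model.generate(**inputs, num_beams=5, **kwargs)
    return tokenizer.decode(hypotheses[0], skip_special_tokens=True)
```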
License
The model is licensed under the MIT License, allowing for flexible use and modification.