t5_translate_en_ru_zh_small_1024
utrobinmv
Introduction
The T5 English, Russian, and Chinese Multilingual Machine Translation model is a transformer-based model designed for multitask translation. It translates in six directions: Russian-Chinese, Chinese-Russian, English-Chinese, Chinese-English, English-Russian, and Russian-English.
Architecture
This model employs the T5 transformer architecture, configured specifically for machine translation tasks. It translates directly between any pair of the supported languages; the target language is selected with a prefix of the form 'translate to <lang>: ' (for example, 'translate to zh: '). The source language does not need to be specified explicitly, so the model can also handle multilingual source text.
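The prefix convention can be illustrated with a small helper. This is a minimal sketch, not part of the model's own code: the function name `add_target_prefix` is hypothetical, and the short codes `en` and `ru` are assumptions made by analogy with the 'translate to zh: ' prefix used in this guide.

```python
# Hypothetical helper; the name and the set of codes are illustrative assumptions.
SUPPORTED_TARGETS = {"en", "ru", "zh"}  # assumed short codes for English, Russian, Chinese


def add_target_prefix(text: str, target_lang: str) -> str:
    """Prepend the prefix that tells the model which language to translate into."""
    if target_lang not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    # Only the target language is named; no source-language tag is added.
    return f"translate to {target_lang}: {text}"


print(add_target_prefix("Hello, world!", "zh"))  # prints: translate to zh: Hello, world!
```

Rejecting unknown codes early keeps a typo in the language code from silently producing an unprefixed or mis-prefixed input.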
Training
The model is trained on datasets such as CCMatrix, and its performance is evaluated with metrics such as SacreBLEU. It supports translations for Russian (ru_RU), Chinese (zh_CN), and English (en_US).
Guide: Running Locally
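- Setup Environment: Ensure you have Python installed along with the transformers library. The code in this guide also relies on PyTorch (it requests "pt" tensors), and in most transformers versions T5Tokenizer additionally needs the sentencepiece package.
- Device Selection: Choose 'cuda' for GPU or 'cpu' for CPU-based translations.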
- Load Model and Tokenizer:
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    device = 'cuda'  # or 'cpu'
    model_name = 'utrobinmv/t5_translate_en_ru_zh_small_1024'
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    model.to(device)
    tokenizer = T5Tokenizer.from_pretrained(model_name)
- Prepare Input Text:
  - Use a prefix like 'translate to zh: ' to specify the target language.
  - Example for Russian to Chinese:
    prefix = 'translate to zh: '
    src_text = prefix + "Цель разработки — предоставить пользователям личного синхронного переводчика."
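- Generate Translations:
    # Tokenize the prefixed text and move the tensors to the chosen device
    input_ids = tokenizer(src_text, return_tensors="pt")
    generated_tokens = model.generate(**input_ids.to(device))
    # Decode token IDs back to text; result is a list with one string per input
    result = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
    print(result)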
- Cloud GPU Suggestion: For improved performance, consider using cloud platforms that offer GPU support, such as AWS, Google Cloud, or Microsoft Azure.
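The steps above can be combined into one reusable function. The following is a hedged sketch rather than the model author's code: `pick_device`, `translate_batch`, and `demo` are hypothetical names, batching with `padding=True` and the `max_new_tokens=256` cap are assumptions added for illustration, and the English example sentence is invented.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:  # PyTorch missing: report 'cpu'; loading the model still needs it
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def translate_batch(texts, target_lang, model, tokenizer, device, max_new_tokens=256):
    """Translate a list of strings into target_lang (e.g. 'zh') with one generate() call."""
    prefixed = [f"translate to {target_lang}: {text}" for text in texts]
    # padding=True pads shorter inputs so the batch forms a rectangular tensor
    batch = tokenizer(prefixed, return_tensors="pt", padding=True).to(device)
    generated = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


def demo():
    """Download the model (on first run) and translate one English sentence to Chinese."""
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    device = pick_device()
    model_name = "utrobinmv/t5_translate_en_ru_zh_small_1024"
    model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    print(translate_batch(["The weather is nice today."], "zh", model, tokenizer, device))
```

Calling `demo()` runs the whole pipeline; it is not invoked automatically because it downloads the model weights. Passing the model and tokenizer into `translate_batch` as arguments means they are loaded once and reused across calls.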
License
The model is licensed under the Apache-2.0 License.