OPUS-MT-RU-EN Model

Introduction

The OPUS-MT-RU-EN model is developed by the Language Technology Research Group at the University of Helsinki. It is a transformer-based model designed for translating Russian text to English. The model leverages the OPUS dataset and is licensed under CC-BY-4.0.

Architecture

Model Type: Transformer-align
Languages:
- Source: Russian
- Target: English

Training

Training Data: OPUS, an open collection of parallel corpora used for machine translation.
Preprocessing: Involves normalization and SentencePiece tokenization.
Original Weights: Available for download as opus-2020-02-26.zip.
Evaluation: The model has been evaluated with BLEU and chrF on multiple test sets, with BLEU scores ranging from 27.9 to 61.1 depending on the test set.
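
To make the two metrics concrete, the sketch below scores a set of translations against reference translations. It assumes the third-party sacreBLEU package (`pip install sacrebleu`); the source does not say which tool produced the reported numbers, and `score_translations` is a hypothetical helper name. Note that the chrF scale (0 to 1 or 0 to 100) differs between sacreBLEU versions.

```python
def score_translations(hypotheses, references):
    """Return corpus-level BLEU and chrF for parallel lists of strings.

    Hypothetical helper for illustration; not part of the model card.
    """
    import sacrebleu  # third-party dependency, imported lazily

    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")

    # sacreBLEU expects a list of reference streams; here there is only one.
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    chrf = sacrebleu.corpus_chrf(hypotheses, [references])
    return {"bleu": bleu.score, "chrf": chrf.score}
```

Scores obtained this way are only comparable to published figures when the test set, tokenization, and metric settings match.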

Guide: Running Locally

To run the OPUS-MT-RU-EN model locally, follow these steps:

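Install Libraries: Ensure you have the transformers library installed, along with sentencepiece (required by the model's tokenizer) and a backend such as PyTorch.
```
pip install transformers sentencepiece torch
```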

Load the Model:

```
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-ru-en")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-ru-en")
```
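
Translate Text: Once the tokenizer and model are loaded, translation follows the usual tokenize, generate, decode pattern. The sketch below is a minimal illustration rather than the maintainers' reference code; the helper names `chunked` and `translate`, the batch size, and the `max_new_tokens` value are assumptions chosen for the example.

```python
def chunked(items, size):
    """Yield successive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate(texts, tokenizer, model, batch_size=8, max_new_tokens=256):
    """Translate a list of Russian strings to English, batch by batch.

    The function name and default values are illustrative assumptions.
    """
    translations = []
    for batch in chunked(texts, batch_size):
        # Pad to the longest sentence in the batch; truncate overly long inputs.
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # Drop padding and end-of-sequence tokens when converting back to text.
        translations.extend(
            tokenizer.batch_decode(output_ids, skip_special_tokens=True)
        )
    return translations
```

With the `tokenizer` and `model` objects loaded above, `translate(["Привет, мир!"], tokenizer, model)` should return a one-element list containing the English translation.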

Cloud GPUs: For faster processing, consider using cloud-based GPU services like AWS, Google Cloud, or Azure.
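
Whether the GPU is local or rented, the model and its inputs must be placed on the same device before generation. The following is a minimal sketch assuming a PyTorch backend; `pick_device` and `translate_on` are hypothetical helper names, and the `max_new_tokens` default is an arbitrary choice.

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def translate_on(device, texts, tokenizer, model, max_new_tokens=256):
    """Translate one batch of strings on the given device (illustrative helper)."""
    model = model.to(device)  # move the weights to the target device
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    inputs = inputs.to(device)  # inputs must live on the same device as the model
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

A call such as `translate_on(pick_device(), ["Доброе утро"], tokenizer, model)` uses the GPU when one is visible and falls back to the CPU otherwise.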

License

The OPUS-MT-RU-EN model is released under the CC-BY-4.0 license, allowing for sharing and adaptation with appropriate credit.
