mt5-large-parsinlu-translation_en_fa

persiannlp

Introduction

The mt5-large-parsinlu-translation_en_fa model is a machine translation model based on mT5, designed to translate English text into Persian. It is part of the Persian NLP (persiannlp) initiative and is trained on the ParsiNLU dataset.

Architecture

The model employs the mT5 architecture, a multilingual variant of T5 (Text-to-Text Transfer Transformer), optimized for conditional generation tasks such as translation. The model is trained specifically for English to Persian translation, leveraging the robust capabilities of mT5 for handling multiple languages.

Training

The model is fine-tuned on English-Persian sentence pairs from the ParsiNLU dataset, optimizing for translations that are accurate and fluent while preserving the meaning of the source text.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the Transformers Library: Ensure you have the transformers library from Hugging Face installed.

    pip install transformers
    
  2. Load the Model and Tokenizer:

    from transformers import MT5ForConditionalGeneration, MT5Tokenizer
    
    model_size = "large"
    model_name = f"persiannlp/mt5-{model_size}-parsinlu-translation_en_fa"
    tokenizer = MT5Tokenizer.from_pretrained(model_name)
    model = MT5ForConditionalGeneration.from_pretrained(model_name)
    
  3. Define a Function to Run the Model:

    def run_model(input_string, **generator_args):
        # Tokenize the English input, generate Persian output, and decode it.
        input_ids = tokenizer.encode(input_string, return_tensors="pt")
        res = model.generate(input_ids, **generator_args)
        output = tokenizer.batch_decode(res, skip_special_tokens=True)
        print(output)
        return output
    
  4. Translate Text: Use the run_model function to translate English sentences to Persian.

    run_model("Praise be to Allah, the Cherisher and Sustainer of the worlds;")
    
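Decoding can be tuned by passing keyword arguments through `run_model` to `model.generate`. The sketch below collects a few standard Hugging Face `generate()` parameters in one place; `beam_search_args` is a hypothetical helper, not part of the model card, and the exact defaults are illustrative.

```python
def beam_search_args(num_beams=4, max_length=128):
    # Standard Hugging Face generate() kwargs for beam-search decoding.
    return {
        "num_beams": num_beams,          # beam search width
        "max_length": max_length,        # cap on generated tokens
        "early_stopping": True,          # stop once all beams are finished
        "no_repeat_ngram_size": 3,       # discourage repeated phrases
    }

# Usage with the run_model helper defined above (requires the model weights):
# run_model("Praise be to Allah, the Cherisher and Sustainer of the worlds;",
#           **beam_search_args())
```

Greedy decoding (the default) is faster; beam search usually yields more fluent translations at the cost of extra compute.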

For optimal performance, especially with a model of this size, a GPU is recommended. Cloud platforms such as AWS, Google Cloud, or Azure provide suitable GPU instances.
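Running on a GPU only requires moving the model and inputs to the same device. A minimal sketch, assuming PyTorch is installed; `pick_device` is a hypothetical helper for readability:

```python
def pick_device(cuda_available: bool) -> str:
    # Prefer a GPU when PyTorch reports one; otherwise fall back to CPU.
    return "cuda" if cuda_available else "cpu"

# With PyTorch and the model loaded as above:
# import torch
# device = pick_device(torch.cuda.is_available())
# model.to(device)
# input_ids = tokenizer.encode("How are you?", return_tensors="pt").to(device)
# model.generate(input_ids)
```

Both the model and the input tensors must live on the same device, or PyTorch raises a device-mismatch error.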

License

This model is distributed under the CC BY-NC-SA 4.0 license. This license allows for non-commercial use, sharing, and adaptation, with attribution and under the same license terms.
