mt5-translate-yue-zh

botisan-ai

Introduction

The mt5-translate-yue-zh model is a fine-tuned version of google/mt5-base for translating Cantonese (Yue) sentences into Mandarin. It was trained on the x-tech/cantonese-mandarin-translations dataset.

Architecture

The model is based on mT5, a multilingual variant of the T5 architecture that frames every task as text-to-text generation. Here that framework is applied to translation between Yue Chinese (Cantonese) and Mandarin.

Training

Training and Evaluation Data

  • Dataset: x-tech/cantonese-mandarin-translations

Training Procedure

  • Training follows the standard fine-tuning procedure of the Hugging Face Transformers library with the PyTorch backend.

Training Hyperparameters

  • Learning Rate: 5e-05
  • Train Batch Size: 1
  • Eval Batch Size: 8
  • Seed: 42
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler Type: Linear
  • Number of Epochs: 3.0
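The hyperparameters above map naturally onto keyword arguments of Hugging Face's Seq2SeqTrainingArguments. A minimal sketch (the parameter names follow the Transformers API; the original training script is not published with this card):

```python
# Hypothetical sketch: the reported hyperparameters expressed with the
# parameter names used by transformers.Seq2SeqTrainingArguments.
hyperparams = dict(
    learning_rate=5e-5,             # Learning Rate
    per_device_train_batch_size=1,  # Train Batch Size
    per_device_eval_batch_size=8,   # Eval Batch Size
    seed=42,                        # Seed
    adam_beta1=0.9,                 # Adam betas
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # Adam epsilon
    lr_scheduler_type="linear",     # LR Scheduler Type
    num_train_epochs=3.0,           # Number of Epochs
)

# These would be unpacked into the training arguments, e.g.:
# args = Seq2SeqTrainingArguments(output_dir="out", **hyperparams)
```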

Training Results

  • A validation set has not yet been established, so evaluation results are not available.

Framework Versions

  • Transformers: 4.12.5
  • PyTorch: 1.8.1
  • Datasets: 1.15.1
  • Tokenizers: 0.10.3
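To reproduce the original environment, the pinned versions listed above can be installed explicitly (a sketch; newer versions may also work but are untested here):

```shell
pip install transformers==4.12.5 torch==1.8.1 datasets==1.15.1 tokenizers==0.10.3
```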

Guide: Running Locally

  1. Install Dependencies: Ensure you have Python and pip installed. Then, install the necessary libraries:
    pip install torch transformers datasets
    
  2. Clone Repository: Clone the model repository from Hugging Face.
    git clone https://huggingface.co/botisan-ai/mt5-translate-yue-zh
    
  3. Run the Model: Load and use the model in a Python script.
    from transformers import MT5ForConditionalGeneration, MT5Tokenizer
    
    # Load the fine-tuned weights and the base mT5 tokenizer
    model = MT5ForConditionalGeneration.from_pretrained('botisan-ai/mt5-translate-yue-zh')
    tokenizer = MT5Tokenizer.from_pretrained('google/mt5-base')
    
    # The input is prefixed with the task instruction, T5-style
    input_text = "translate cantonese to mandarin: <your sentence here>"
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids
    outputs = model.generate(input_ids)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
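The model expects a T5-style task prefix on every input, as shown in step 3. A tiny helper makes that explicit (the helper name is illustrative; the prefix string is taken from the example above):

```python
def build_input(sentence: str) -> str:
    """Prepend the task prefix used in this model card's usage example."""
    return "translate cantonese to mandarin: " + sentence

# The returned string is what gets passed to the tokenizer.
print(build_input("你喺邊度呀？"))  # translate cantonese to mandarin: 你喺邊度呀？
```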

Suggestion: Cloud GPUs

For optimal performance, consider using cloud GPUs from services like AWS EC2, Google Cloud Platform, or Azure.

License

The model is licensed under the Apache 2.0 License, allowing for both personal and commercial use with proper attribution.
