facebook/mbart-large-50-many-to-one-mmt

Introduction

mBART-50 many-to-one multilingual machine translation is a model fine-tuned from mBART-large-50 for multilingual machine translation. It was introduced in the paper "Multilingual Translation with Extensible Multilingual Pretraining and Finetuning". As a many-to-one model, it translates directly from any of 50 languages into English, rather than between arbitrary language pairs.

Architecture

The model is based on the mBART architecture, a sequence-to-sequence model with a standard Transformer encoder-decoder structure: the encoder reads the source-language text and the decoder generates the English translation token by token.
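The exact dimensions are not stated here, but they can be read directly off the checkpoint's configuration; a minimal sketch using the transformers API:

from transformers import MBartConfig

# Load only the configuration (no model weights are downloaded)
config = MBartConfig.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")

print(config.encoder_layers, config.decoder_layers)  # Transformer layers on each side
print(config.d_model)                                # hidden (embedding) size
print(config.vocab_size)                             # shared multilingual vocabulary size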

Training

mBART-50 extends the original mBART pretraining, which covered 25 languages, to 50 languages; this checkpoint was then fine-tuned on parallel data for many-to-one translation into English. Because both pretraining and finetuning are multilingual, a single model handles all 50 source languages without training a separate model per language pair.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the transformers library from Hugging Face.
  2. Load the MBartForConditionalGeneration model and MBart50TokenizerFast tokenizer.
  3. Prepare your input text in the source language and encode it using the tokenizer.
  4. Generate the translation by passing the encoded input to the model.
  5. Decode the generated tokens to obtain the translated text.

Example:

from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load model and tokenizer
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")

# Example translation from Hindi to English; the target of this many-to-one
# model is always English, so no forced BOS token is needed for generation
tokenizer.src_lang = "hi_IN"
encoded_input = tokenizer("संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है", return_tensors="pt")
generated_tokens = model.generate(**encoded_input)

# batch_decode returns a list with one translation per input sentence
translation = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)

print(translation)
# e.g. ['The head of the UN says there is no military solution in Syria']
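Since the target is always English, translating from another source language only requires changing tokenizer.src_lang. A short sketch translating French instead; the French sentence is an illustrative example, and the supported source-language codes can be listed from the tokenizer:

# List the 50 supported source-language codes (e.g. "hi_IN", "fr_XX")
print(sorted(tokenizer.lang_code_to_id))

# Translate French to English by switching only the source language code
tokenizer.src_lang = "fr_XX"
encoded_fr = tokenizer("La vie est belle.", return_tensors="pt")
generated = model.generate(**encoded_fr)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))  # e.g. ['Life is beautiful.']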

For better performance, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure.
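On a machine with a GPU, generation is substantially faster if the model and the encoded inputs are moved to the device first; a minimal sketch assuming a CUDA-capable GPU:

import torch

# Pick the GPU if one is available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Inputs must live on the same device as the model
encoded_input = tokenizer("संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है", return_tensors="pt").to(device)
generated_tokens = model.generate(**encoded_input)
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))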

License

The model and related files are hosted on the Hugging Face Hub. For the license and usage rights, refer to the model card at huggingface.co/facebook/mbart-large-50-many-to-one-mmt.
