mt5-base-mmarco-v2 (unicamp-dl)

Introduction

The mt5-base-mmarco-v2 model is a variant of mT5 fine-tuned on the multilingual MS MARCO passage ranking dataset. This dataset, mMARCO, includes passages translated into nine different languages using Google Translate. Detailed information is available in the paper "mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset" and the mMARCO GitHub repository.
Architecture
The model uses the mT5 architecture, an encoder-decoder Transformer designed for text-to-text generation tasks. Checkpoints can be loaded in both PyTorch and TensorFlow, allowing flexible deployment across environments.
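As a quick illustration of that framework flexibility, here is a minimal sketch using the standard Transformers loading API; the `from_pt=True` conversion path is included in case only PyTorch weights are published for this checkpoint:

```python
from transformers import MT5ForConditionalGeneration, TFMT5ForConditionalGeneration

model_name = 'unicamp-dl/mt5-base-mmarco-v2'

# PyTorch weights
pt_model = MT5ForConditionalGeneration.from_pretrained(model_name)

# TensorFlow: loads native TF weights if available, otherwise converts from PyTorch
tf_model = TFMT5ForConditionalGeneration.from_pretrained(model_name, from_pt=True)
```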
Training
The model is fine-tuned on the mMARCO dataset, which consists of translated passages from the original MS MARCO dataset. The translations were produced automatically with Google Translate, giving the model broad multilingual coverage.
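For context, rerankers in this family are typically trained with a monoT5-style text-to-text objective: the query and passage are concatenated into a prompt, and the model learns to generate a relevance token. The template and the 'yes'/'no' targets below are assumptions drawn from the monoT5/mMARCO line of work, not details confirmed by this card:

```python
# Hypothetical monoT5-style training pair (template and targets are assumptions)
query = 'how many people live in Paris'
passage = 'Paris has an estimated population of about 2.1 million residents.'

source = f'Query: {query} Document: {passage} Relevant:'
target = 'yes'  # 'no' for a non-relevant (negative) passage
```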
Guide: Running Locally
- Install the Transformers library (the T5 tokenizer also requires the sentencepiece package):

  ```bash
  pip install transformers sentencepiece
  ```

- Load the model (an end-to-end reranking sketch follows this list):

  ```python
  from transformers import T5Tokenizer, MT5ForConditionalGeneration

  model_name = 'unicamp-dl/mt5-base-mmarco-v2'
  tokenizer = T5Tokenizer.from_pretrained(model_name)
  model = MT5ForConditionalGeneration.from_pretrained(model_name)
  ```

- Cloud GPUs: For optimal performance, especially on large-scale tasks, consider cloud GPU services such as AWS EC2, Google Cloud, or Azure.
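Putting the steps together, here is a minimal end-to-end sketch that reranks a few passages for one query. The prompt template and the 'yes' relevance token carry the same caveat noted in the Training section; each score is the first-step logit of the assumed 'yes' token.

```python
import torch
from transformers import T5Tokenizer, MT5ForConditionalGeneration

model_name = 'unicamp-dl/mt5-base-mmarco-v2'
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name).eval()

# Assumed relevance token; see the caveat in the Training section.
yes_id = tokenizer.encode('yes', add_special_tokens=False)[0]
start_id = model.config.decoder_start_token_id

def relevance(query: str, passage: str) -> float:
    # monoT5-style prompt (assumed template)
    inputs = tokenizer(f'Query: {query} Document: {passage} Relevant:',
                       return_tensors='pt', truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs,
                       decoder_input_ids=torch.tensor([[start_id]])).logits
    return logits[0, -1, yes_id].item()

query = 'quantos habitantes tem Paris'  # multilingual queries are supported
passages = [
    'Paris tem cerca de 2,1 milhões de habitantes.',
    'O Rio Sena atravessa a cidade de Paris.',
    'Berlim é a capital da Alemanha.',
]

for score, passage in sorted(((relevance(query, p), p) for p in passages), reverse=True):
    print(f'{score:.2f}  {passage}')
```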
License
The mt5-base-mmarco-v2 model is released under the MIT License, permitting open use, modification, and distribution with proper attribution.