mt5-base-mmarco-v2 (unicamp-dl)

Introduction

The mt5-base-mmarco-v2 model is a variant of mT5 fine-tuned on the multilingual MS MARCO passage ranking dataset. This dataset, mMARCO, includes passages translated into nine different languages using Google Translate. Detailed information is available in the paper "mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset" and the mMARCO GitHub repository.
Architecture
The model uses the mT5 architecture, an encoder-decoder Transformer designed for text-to-text generation tasks. Checkpoints can be loaded in both PyTorch and TensorFlow, allowing flexible deployment across environments.
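As a quick illustration of that framework flexibility, here is a minimal sketch using the standard Transformers loading API; the `from_pt=True` conversion path is included in case only PyTorch weights are published for this checkpoint:

```python
from transformers import MT5ForConditionalGeneration, TFMT5ForConditionalGeneration

model_name = 'unicamp-dl/mt5-base-mmarco-v2'

# PyTorch weights
pt_model = MT5ForConditionalGeneration.from_pretrained(model_name)

# TensorFlow: loads native TF weights if available, otherwise converts from PyTorch
tf_model = TFMT5ForConditionalGeneration.from_pretrained(model_name, from_pt=True)
```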
Training
The model is fine-tuned on the mMARCO dataset, which consists of translated passages from the original MS MARCO dataset. The translations were produced automatically with Google Translate, giving the model broad multilingual coverage.
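For context, rerankers in this family are typically trained with a monoT5-style text-to-text objective: the query and passage are concatenated into a prompt, and the model learns to generate a relevance token. The template and the 'yes'/'no' targets below are assumptions drawn from the monoT5/mMARCO line of work, not details confirmed by this card:

```python
# Hypothetical monoT5-style training pair (template and targets are assumptions)
query = 'how many people live in Paris'
passage = 'Paris has an estimated population of about 2.1 million residents.'

source = f'Query: {query} Document: {passage} Relevant:'
target = 'yes'  # 'no' for a non-relevant (negative) passage
```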
Guide: Running Locally
- Install the Transformers library (the T5 tokenizer also requires the sentencepiece package):

  ```bash
  pip install transformers sentencepiece
  ```

- Load the model (an end-to-end reranking sketch follows this list):

  ```python
  from transformers import T5Tokenizer, MT5ForConditionalGeneration

  model_name = 'unicamp-dl/mt5-base-mmarco-v2'
  tokenizer = T5Tokenizer.from_pretrained(model_name)
  model = MT5ForConditionalGeneration.from_pretrained(model_name)
  ```

- Cloud GPUs: For optimal performance, especially on large-scale tasks, consider cloud GPU services such as AWS EC2, Google Cloud, or Azure.
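Putting the steps together, here is a minimal end-to-end sketch that reranks a few passages for one query. The prompt template and the 'yes' relevance token carry the same caveat noted in the Training section; each score is the first-step logit of the assumed 'yes' token.

```python
import torch
from transformers import T5Tokenizer, MT5ForConditionalGeneration

model_name = 'unicamp-dl/mt5-base-mmarco-v2'
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name).eval()

# Assumed relevance token; see the caveat in the Training section.
yes_id = tokenizer.encode('yes', add_special_tokens=False)[0]
start_id = model.config.decoder_start_token_id

def relevance(query: str, passage: str) -> float:
    # monoT5-style prompt (assumed template)
    inputs = tokenizer(f'Query: {query} Document: {passage} Relevant:',
                       return_tensors='pt', truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs,
                       decoder_input_ids=torch.tensor([[start_id]])).logits
    return logits[0, -1, yes_id].item()

query = 'quantos habitantes tem Paris'  # multilingual queries are supported
passages = [
    'Paris tem cerca de 2,1 milhões de habitantes.',
    'O Rio Sena atravessa a cidade de Paris.',
    'Berlim é a capital da Alemanha.',
]

for score, passage in sorted(((relevance(query, p), p) for p in passages), reverse=True):
    print(f'{score:.2f}  {passage}')
```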
License
The mt5-base-mmarco-v2 model is released under the MIT License, permitting open use, modification, and distribution with proper attribution.