MonoT5-Large-MSMARCO (castorini)
Introduction
MonoT5-Large-MSMARCO is a T5-large reranker model fine-tuned on the MS MARCO passage dataset. It is designed for document ranking tasks, leveraging the T5 architecture to improve retrieval performance.
Architecture
The model is based on the T5 (Text-to-Text Transfer Transformer) architecture, a versatile sequence-to-sequence model that handles a wide range of NLP tasks by casting every input and output as text. This variant uses the T5-large configuration (roughly 770M parameters), trading higher compute cost for stronger reranking quality.
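Concretely, monoT5 casts reranking as text generation: the query and a candidate passage are packed into a single prompt, and the model is trained to generate the word "true" or "false". A minimal sketch of this input format, following the template described in the monoT5 paper (the query and passage strings here are illustrative):

```python
query = "how far is the moon from earth"
passage = "The average distance between the Earth and the Moon is 384,400 km."

# monoT5 prompt template: the model generates "true" (relevant) or "false".
input_text = f"Query: {query} Document: {passage} Relevant:"
```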
Training
MonoT5-Large-MSMARCO was fine-tuned on the MS MARCO passage dataset for 100,000 steps (approximately 10 epochs). The model is optimized for reranking: it refines the results of a first-stage retriever by scoring the relevance of each candidate passage or document to the query.
Guide: Running Locally
- Installation: Clone the model repository from Hugging Face and install the necessary dependencies; you will need libraries such as transformers and torch.
- Model Loading: Load the MonoT5-Large-MSMARCO model with the transformers library.
- Inference: Feed the model query-passage pairs and use its output to rerank candidates (see the sketch after this list).
- Example Usage: Refer to the examples provided on GitHub for reranking MS MARCO passages or Robust04 documents.
- Hardware Suggestions: To run this model efficiently, consider cloud GPUs such as the NVIDIA T4 or V100, available on platforms like AWS, Google Cloud, or Azure.
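A minimal sketch of the loading and inference steps above, assuming transformers, torch, and sentencepiece are installed; it scores one illustrative query-passage pair by taking the probability the model assigns to generating "true" rather than "false":

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Requires: pip install transformers torch sentencepiece
model_name = "castorini/monot5-large-msmarco"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.eval()

query = "how far is the moon from earth"
passage = "The average distance between the Earth and the Moon is 384,400 km."

# monoT5 prompt template: the model is trained to answer "true" or "false".
input_text = f"Query: {query} Document: {passage} Relevant:"
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)

# Vocabulary ids of the "true"/"false" output tokens (SentencePiece pieces).
true_id = tokenizer.convert_tokens_to_ids("▁true")
false_id = tokenizer.convert_tokens_to_ids("▁false")

with torch.no_grad():
    # Single decoding step: feed the decoder start token and read the
    # logits at the first generated position.
    decoder_input = torch.full((1, 1), model.config.decoder_start_token_id,
                               dtype=torch.long)
    logits = model(**inputs, decoder_input_ids=decoder_input).logits[0, 0]

# Relevance score: softmax restricted to the "true"/"false" logits.
score = torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()
print(f"relevance score: {score:.4f}")
```

To rerank a candidate list, score every query-passage pair this way (batching pairs together for throughput) and sort the passages by descending score.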
License
The model is hosted on Hugging Face by Castorini and is subject to the applicable licensing terms. Users should review the license details on the Hugging Face model page or in the accompanying documentation.