MonoT5-Large-MSMARCO

castorini

Introduction

MonoT5-Large-MSMARCO is a T5-large reranker fine-tuned on the MS MARCO passage dataset. It is designed for passage and document reranking: given a query and a candidate text, it scores the candidate's relevance, typically to reorder the results of a first-stage retriever such as BM25.

Architecture

The model is based on the T5 (Text-to-Text Transfer Transformer) architecture, a sequence-to-sequence model that handles diverse NLP tasks by casting all inputs and outputs as text. MonoT5 follows this framing: the query and a candidate passage are concatenated into a single prompt, and the model generates the token "true" or "false" to indicate relevance. This variant uses the T5-large configuration (roughly 770M parameters), trading higher inference cost for stronger reranking quality.
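
To make the text-to-text framing concrete, here is a minimal scoring sketch using the prompt template from the MonoT5 paper ("Query: ... Document: ... Relevant:"); the relevance score is the softmax probability of "true" versus "false" at the first decoded position. The function name is illustrative, not part of any library API:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("castorini/monot5-large-msmarco")
model = T5ForConditionalGeneration.from_pretrained("castorini/monot5-large-msmarco")
model.eval()

def relevance_score(query: str, passage: str) -> float:
    """Probability that the model judges `passage` relevant to `query`."""
    # MonoT5 prompt template; the model was trained to emit "true" or "false".
    prompt = f"Query: {query} Document: {passage} Relevant:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)

    # "true" and "false" are single tokens in T5's SentencePiece vocabulary.
    true_id = tokenizer.encode("true", add_special_tokens=False)[0]
    false_id = tokenizer.encode("false", add_special_tokens=False)[0]

    # One decoding step: feed the decoder start token, read the first-step logits.
    decoder_input = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input).logits[0, 0]

    # Softmax over just the "true"/"false" logits gives the relevance score.
    return torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()
```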

Training

MonoT5-Large-MSMARCO was fine-tuned on the MS MARCO passage dataset for 100,000 steps (approximately 10 epochs). It is a pointwise reranker: rather than retrieving from scratch, it refines an initial candidate list by scoring each query-passage pair independently for relevance.
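
For orientation, a minimal sketch of how such text-to-text training pairs are typically assembled; the tab-separated triples file (query, relevant passage, non-relevant passage, in the style of MS MARCO's triples.train) and its path are assumptions, not artifacts shipped with this model:

```python
def build_examples(triples_path: str):
    """Turn (query, positive, negative) triples into MonoT5 text-to-text pairs."""
    examples = []
    with open(triples_path, encoding="utf-8") as f:
        for line in f:
            query, positive, negative = line.rstrip("\n").split("\t")
            # Relevant pair -> target "true"; non-relevant pair -> target "false".
            examples.append((f"Query: {query} Document: {positive} Relevant:", "true"))
            examples.append((f"Query: {query} Document: {negative} Relevant:", "false"))
    return examples
```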

Guide: Running Locally

  1. Installation: Install the necessary dependencies, such as transformers, torch, and sentencepiece (required by the T5 tokenizer). The model weights are downloaded automatically from the Hugging Face Hub on first load.

  2. Model Loading: Load the model and its tokenizer with the transformers library, e.g. T5ForConditionalGeneration.from_pretrained("castorini/monot5-large-msmarco").

  3. Inference: Format each query-passage pair with the MonoT5 prompt template and use the probability of the token "true" as the relevance score, as shown in the end-to-end sketch after this list.

  4. Example Usage: Refer to the examples in castorini's repositories on GitHub (e.g., PyGaggle) for reranking MS MARCO passages or Robust04 documents.

  5. Hardware Suggestions: To efficiently run this model, consider using cloud GPUs such as NVIDIA T4 or V100, available on platforms like AWS, Google Cloud, or Azure.
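
Putting the steps together, here is a minimal end-to-end sketch that reranks a small candidate list; the query and passages are illustrative, and the template follows the MonoT5 paper:

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("castorini/monot5-large-msmarco")
model = T5ForConditionalGeneration.from_pretrained("castorini/monot5-large-msmarco")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).eval()

query = "what causes ocean tides"
candidates = [
    "Tides are caused by the gravitational pull of the moon and sun on Earth's oceans.",
    "The stock market rallied sharply after the quarterly earnings report.",
    "Most coastlines see the sea level rise and fall roughly twice a day.",
]

# Build one prompt per candidate and score the whole batch in a single forward pass.
prompts = [f"Query: {query} Document: {doc} Relevant:" for doc in candidates]
inputs = tokenizer(prompts, return_tensors="pt", padding=True,
                   truncation=True, max_length=512).to(device)
true_id = tokenizer.encode("true", add_special_tokens=False)[0]
false_id = tokenizer.encode("false", add_special_tokens=False)[0]
decoder_input = torch.full((len(prompts), 1), model.config.decoder_start_token_id,
                           dtype=torch.long, device=device)
with torch.no_grad():
    logits = model(**inputs, decoder_input_ids=decoder_input).logits[:, 0, :]

# Relevance score = P("true") vs. P("false") at the first decoded position.
scores = torch.softmax(logits[:, [true_id, false_id]], dim=-1)[:, 0]

# Rerank: highest score first.
for score, doc in sorted(zip(scores.tolist(), candidates), reverse=True):
    print(f"{score:.4f}  {doc}")
```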

License

The model is published on Hugging Face by Castorini and is subject to the licensing terms stated there. Users should review the license details on the Hugging Face model page or in the accompanying documentation before use.