ms-marco-MiniLM-L-6-v2

cross-encoder

Introduction

The Cross-Encoder model ms-marco-MiniLM-L-6-v2 is designed for information retrieval tasks and was trained on the MS MARCO Passage Ranking dataset. Given a query and a candidate passage, it encodes the pair jointly and outputs a relevance score, which can be used to rank passages for the query. The model can be used with both the Transformers and SentenceTransformers libraries.

Architecture

The Cross-Encoder architecture builds on transformer models fine-tuned for sequence classification: the query and passage are fed to the network together, and a single relevance score is produced. The ms-marco-MiniLM-L-6-v2 variant uses a 6-layer MiniLM encoder and is one of several models in the series that trade off inference speed against ranking quality.
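
A quick way to confirm the architecture is to inspect the model configuration; the values in the comments below reflect the MiniLM-L-6 design and are expectations rather than guarantees:

  from transformers import AutoConfig

  config = AutoConfig.from_pretrained('cross-encoder/ms-marco-MiniLM-L-6-v2')
  print(config.num_hidden_layers)  # expected: 6 transformer layers
  print(config.hidden_size)        # expected: 384 (MiniLM hidden size)
  print(config.num_labels)         # expected: 1 (a single relevance score)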

Training

The model was trained on the MS MARCO Passage Ranking dataset. Training scripts and details are available in the SentenceTransformers GitHub repository, which documents the methods used for passage ranking and retrieval.
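
For reference, a minimal fine-tuning sketch using the CrossEncoder.fit API (v2.x-style SentenceTransformers) might look as follows; the toy pairs and hyperparameters are illustrative, not the original training setup:

  from torch.utils.data import DataLoader
  from sentence_transformers import InputExample
  from sentence_transformers.cross_encoder import CrossEncoder

  # Toy (query, passage) pairs with binary relevance labels;
  # the released model was trained on MS MARCO Passage Ranking data
  train_samples = [
      InputExample(texts=['how many people live in berlin',
                          'Berlin has about 3.7 million inhabitants.'], label=1.0),
      InputExample(texts=['how many people live in berlin',
                          'Paris is the capital of France.'], label=0.0),
  ]
  train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=2)

  # num_labels=1 yields a single relevance score per pair
  model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2', num_labels=1)
  model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=10)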

Guide: Running Locally

  1. Install Required Libraries:

    • For the Transformers route, ensure transformers and torch are installed.
    • For SentenceTransformers, install the sentence-transformers package directly.
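    • For example, with pip (any reasonably recent versions should work):
      pip install transformers torch
      pip install sentence-transformers
      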
  2. Load the Model:

    • Use the following code snippet to load the model with Transformers:
      from transformers import AutoTokenizer, AutoModelForSequenceClassification
      import torch
      
      # Download the model and its tokenizer from the Hugging Face Hub
      model = AutoModelForSequenceClassification.from_pretrained('cross-encoder/ms-marco-MiniLM-L-6-v2')
      tokenizer = AutoTokenizer.from_pretrained('cross-encoder/ms-marco-MiniLM-L-6-v2')
      
  3. Make Predictions:

    • Prepare your input data and use the model to compute scores:
      # Queries and passages are paired positionally: ('Query1', 'Passage1'), ('Query2', 'Passage2')
      features = tokenizer(['Query1', 'Query2'], ['Passage1', 'Passage2'], padding=True, truncation=True, return_tensors="pt")
      model.eval()
      with torch.no_grad():
          # Higher logits mean higher predicted query-passage relevance
          scores = model(**features).logits
          print(scores)
      
  4. Use SentenceTransformers:

    • Alternatively, load the model using SentenceTransformers:
      from sentence_transformers import CrossEncoder
      model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2', max_length=512)
      # predict returns one relevance score per (query, passage) pair
      scores = model.predict([('Query', 'Passage1'), ('Query', 'Passage2')])
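
    • Newer sentence-transformers releases also provide a convenience rank method; assuming your installed version includes it, re-ranking a candidate list for a single query looks like this:
      # Each result is a dict with 'corpus_id' and 'score', sorted by score
      results = model.rank('Query', ['Passage1', 'Passage2', 'Passage3'])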
      
  5. Consider Cloud GPUs:

    • For higher throughput, especially when re-ranking large candidate sets, consider using a cloud GPU such as an NVIDIA V100; batched inference benefits substantially, as sketched below.
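
As a sketch of GPU usage via SentenceTransformers (assuming a CUDA-capable machine; the device argument selects where the model runs):

  import torch
  from sentence_transformers import CrossEncoder

  # Fall back to CPU when no GPU is available
  device = 'cuda' if torch.cuda.is_available() else 'cpu'
  model = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2', max_length=512, device=device)

  # Batched scoring benefits substantially from a GPU on large candidate sets
  scores = model.predict([('Query', 'Passage1'), ('Query', 'Passage2')], batch_size=32)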

License

The model is licensed under the Apache-2.0 License, which permits both personal and commercial use provided the license's notice and attribution requirements are met.
