cross-encoder/mmarco-mMiniLMv2-L12-H384-v1

Introduction
The Cross-Encoder model mmarco-mMiniLMv2-L12-H384-v1 is designed for multilingual information retrieval tasks. It is based on a machine-translated version of the MS MARCO dataset, translated into 15 languages using Google Translate. The model leverages the multilingual MiniLMv2 architecture and is suitable for applications such as re-ranking retrieved passages.
Architecture
The Cross-Encoder employs the multilingual MiniLMv2 model, which is a distilled version of the XLM-R Large model. It is designed to handle multiple languages efficiently, enhancing its applicability in diverse linguistic contexts.
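As a quick illustration of what the name encodes (L12 for 12 transformer layers, H384 for a hidden size of 384), the architecture hyperparameters can be read from the model configuration. This is a minimal sketch, assuming the Hugging Face model id is cross-encoder/mmarco-mMiniLMv2-L12-H384-v1:

```python
from transformers import AutoConfig

# Inspect the architecture implied by the model name:
# L12 -> 12 transformer layers, H384 -> hidden size of 384.
config = AutoConfig.from_pretrained("cross-encoder/mmarco-mMiniLMv2-L12-H384-v1")
print(config.num_hidden_layers)  # expected: 12
print(config.hidden_size)        # expected: 384
```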
Training
Training was conducted on the MMARCO dataset, a multilingual extension of MS MARCO. The training approach involves encoding a query alongside potential passages and ranking them based on relevance. Detailed training scripts are available in the SentenceTransformers repository.
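The usage pattern this training targets is scoring each (query, passage) pair and ranking the passages by predicted relevance. Below is a minimal inference-time sketch of that re-ranking step; it assumes the sentence-transformers package is installed and that the model id is cross-encoder/mmarco-mMiniLMv2-L12-H384-v1:

```python
from sentence_transformers import CrossEncoder

# Illustrative re-ranking sketch; the model id below is assumed.
model = CrossEncoder("cross-encoder/mmarco-mMiniLMv2-L12-H384-v1")

query = "How many people live in Berlin?"
passages = [
    "Berlin is the capital of Germany and has about 3.7 million inhabitants.",
    "New York City is famous for the Metropolitan Museum of Art.",
]

# Score every (query, passage) pair, then sort passages by descending relevance.
scores = model.predict([(query, passage) for passage in passages])
ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
for passage, score in ranked:
    print(f"{score:.3f}\t{passage}")
```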
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies: Ensure that you have Python, PyTorch, and the sentence-transformers or transformers library installed (e.g., via pip install sentence-transformers).
- Load the Model with SentenceTransformers:

  ```python
  from sentence_transformers import CrossEncoder

  # Replace 'model_name' with the model identifier, e.g.
  # 'cross-encoder/mmarco-mMiniLMv2-L12-H384-v1'.
  model = CrossEncoder('model_name')
  scores = model.predict([
      ('Query', 'Paragraph1'),
      ('Query', 'Paragraph2'),
      ('Query', 'Paragraph3'),
  ])
  ```
- Load the Model with Transformers:

  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
  import torch

  model = AutoModelForSequenceClassification.from_pretrained('model_name')
  tokenizer = AutoTokenizer.from_pretrained('model_name')

  # Tokenize query-passage pairs: the first list holds the queries,
  # the second the candidate passages.
  features = tokenizer(
      ['How many people live in Berlin?', 'How many people live in Berlin?'],
      ['Berlin has a population...', 'New York City is famous...'],
      padding=True, truncation=True, return_tensors="pt"
  )

  model.eval()
  with torch.no_grad():
      scores = model(**features).logits
      print(scores)
  ```

  See the note after this list for how to interpret these scores.
- Cloud GPU Recommendation: For efficient processing, consider using cloud-based GPUs from providers such as AWS, Google Cloud, or Azure.
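In both snippets above, the model assigns a relevance score to each (query, passage) pair, with higher scores indicating a better match; re-ranking therefore amounts to sorting the candidate passages by these scores in descending order.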
License
The model is distributed under the Apache 2.0 License, allowing for broad use in both commercial and non-commercial applications.