jina reranker v2 base multilingual
jinaai
Introduction
Jina Reranker v2 (jina-reranker-v2-base-multilingual) is a transformer-based cross-encoder model for text reranking in information retrieval systems. It takes a query and a document together, scores how relevant the document is to the query, and supports multiple languages. Compared with its predecessor and other models, it performs strongly in text retrieval, in multilingual settings, and in specialized tasks such as text-to-SQL and code retrieval.
Architecture
The model processes inputs with a maximum context length of 1024 tokens. Longer texts are split into chunks with a sliding window approach. The model also uses flash attention to improve performance.
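The sliding window idea can be illustrated with a minimal sketch. This is not the model's own chunking code: the function names, the `overlap` value, and the choice to take the maximum chunk score as the document score are all assumptions made for illustration, and the sketch ignores the tokens the query itself would occupy.

```python
def sliding_window_chunks(tokens, max_length=1024, overlap=80):
    """Split a token list into windows of at most max_length tokens.

    Consecutive windows share `overlap` tokens (an assumed value) so that
    text near a boundary appears whole in at least one window.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must satisfy 0 <= overlap < max_length")

    step = max_length - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_length])
        # Stop once the current window reaches the end of the input.
        if start + max_length >= len(tokens):
            break
        start += step
    return chunks


def score_long_document(tokens, score_chunk, max_length=1024, overlap=80):
    """Score each window and keep the best one (an assumed aggregation).

    `score_chunk` is a hypothetical callable that maps one chunk to a float.
    """
    chunks = sliding_window_chunks(tokens, max_length, overlap)
    return max(score_chunk(chunk) for chunk in chunks)
```

With `max_length=4` and `overlap=1`, a ten-token input yields three windows, each sharing one token with its neighbor.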
Training
The model is trained on extensive datasets of query-document pairs and demonstrates high accuracy in reranking tasks across different languages. It is evaluated on benchmarks such as MKQA and BEIR, where it shows strong performance across a range of tasks.
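Retrieval benchmarks such as BEIR commonly report nDCG@10, which rewards placing relevant documents near the top of the ranking. The following is a minimal sketch of that metric, not the benchmark's own evaluation code. It assumes linear gain and computes the ideal ordering from the same candidate list that was reranked.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k relevance labels."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so the top item divides by log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=10):
    """nDCG@k for relevance labels listed in the order the reranker returned them."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists documents in descending order of relevance scores 1.0, and any misordering lowers the value.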
Guide: Running Locally
Basic Steps
- Install required libraries:
    pip install transformers einops
- Load the model:
    from transformers import AutoModelForSequenceClassification

    model = AutoModelForSequenceClassification.from_pretrained(
        'jinaai/jina-reranker-v2-base-multilingual',
        torch_dtype="auto",
        trust_remote_code=True,  # allow the custom model code in the repository to run
    )
    model.to('cuda')  # Use 'cpu' if no GPU is available
    model.eval()  # inference mode
- Process queries and documents:
    query = "Organic skincare products for sensitive skin"
    documents = [
        "Organic skincare for sensitive skin with aloe vera and chamomile.",
        # ... more documents
    ]

    # Pair the query with every document; the model scores each pair.
    sentence_pairs = [[query, doc] for doc in documents]
    scores = model.compute_score(sentence_pairs, max_length=1024)
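Once scores are available, the documents still need to be sorted by them. The helper below is a small sketch of that last step. The name `rank_documents` and its `top_n` parameter are hypothetical, and it assumes the scores form a sequence of floats with one entry per document, in the same order as the documents.

```python
def rank_documents(documents, scores, top_n=None):
    """Return (document, score) pairs sorted from most to least relevant."""
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")

    ranked = sorted(
        zip(documents, scores),
        key=lambda pair: pair[1],  # sort on the relevance score
        reverse=True,
    )
    return ranked if top_n is None else ranked[:top_n]
```

With the variables from the steps above, `rank_documents(documents, scores, top_n=3)` would give the three highest-scoring documents.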
Cloud GPUs
For optimal performance, consider running the model on a cloud GPU from a provider such as AWS, Azure, or Google Cloud. These platforms offer GPU instances suited to running models that use flash attention.
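Before loading the model on a rented or local machine, it can help to check what hardware PyTorch sees. The sketch below uses hypothetical helper names. The compute capability threshold of 8 (Ampere-class GPUs or newer) is an assumption about what recent flash attention builds expect, so check the requirements of the version you install.

```python
def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"  # without PyTorch there is nothing to probe
    return "cuda" if torch.cuda.is_available() else "cpu"


def flash_attention_likely_supported(min_major=8):
    """Rough check: is the current GPU's compute capability at least min_major?

    The default threshold is an assumption, not an official requirement.
    """
    if pick_device() != "cuda":
        return False
    import torch

    major, _minor = torch.cuda.get_device_capability()
    return major >= min_major
```

The result of `pick_device()` can be passed to `model.to(...)` in place of a hard-coded device string.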
License
This model is licensed under the Creative Commons BY-NC 4.0 license, which permits use for research and evaluation purposes. For commercial usage, refer to Jina AI's offerings on platforms such as AWS SageMaker or Azure Marketplace.