INF-Retriever-v1
Introduction
INF-Retriever-v1 is a dense retrieval model built on gte-Qwen2-7B-instruct and fine-tuned for retrieval tasks in both Chinese and English. Developed by INF TECH, it excels in heterogeneous information retrieval and ranked No. 1 on AIR-Bench 24.04 as of December 23, 2024.
Architecture
INF-Retriever-v1 inherits the architecture of gte-Qwen2-7B-instruct and is optimized specifically for retrieval. It can be used through either the sentence-transformers or transformers library and handles both English and Chinese text efficiently.
Training
The model was fine-tuned on retrieval-focused datasets, ensuring high accuracy when retrieving relevant information from large corpora.
Guide: Running Locally
Basic Steps
- Install Dependencies: Ensure Python and libraries such as `transformers`, `torch`, and `sentence_transformers` are installed.
- Load the Model:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("infly/inf-retriever-v1", trust_remote_code=True)
```
- Process Queries and Documents:
  - Use the model's `encode` method to compute embeddings.
  - Compute similarity scores using dot products of the embeddings, as sketched below.
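A minimal sketch of this step, using the model loaded above, is shown below. It relies on plain `encode` calls with normalized embeddings and illustrative example texts; the Hugging Face model card may additionally recommend a task-specific instruction prefix for queries.

```python
# Illustrative sketch: embed queries and documents, then score with dot products.
# The model card may recommend prepending a task instruction to queries; plain
# encode() calls are used here for simplicity.
queries = ["What is the capital of China?"]
documents = [
    "The capital of China is Beijing.",
    "Gravity causes objects to fall toward the Earth.",
]

query_embeddings = model.encode(queries, normalize_embeddings=True)
document_embeddings = model.encode(documents, normalize_embeddings=True)

# With L2-normalized embeddings, the dot product equals cosine similarity.
scores = query_embeddings @ document_embeddings.T
print(scores)  # higher score = more relevant document
```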
- Alternative with Transformers:
```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('infly/inf-retriever-v1', trust_remote_code=True)
model = AutoModel.from_pretrained('infly/inf-retriever-v1', trust_remote_code=True)
```
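For completeness, here is a hedged sketch of how embeddings could then be computed along this route. It assumes last-token pooling (as used by the gte-Qwen2 family the model is built on) and illustrative texts; check the model card for the exact recommended pooling and query instruction.

```python
import torch
import torch.nn.functional as F

def last_token_pool(last_hidden_states, attention_mask):
    # Pool by taking the hidden state of each sequence's last non-padding token,
    # handling both left- and right-padded batches.
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    sequence_lengths = attention_mask.sum(dim=1) - 1
    batch_indices = torch.arange(last_hidden_states.shape[0], device=last_hidden_states.device)
    return last_hidden_states[batch_indices, sequence_lengths]

texts = ["What is the capital of China?", "The capital of China is Beijing."]
batch = tokenizer(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    outputs = model(**batch)

embeddings = last_token_pool(outputs.last_hidden_state, batch["attention_mask"])
embeddings = F.normalize(embeddings, p=2, dim=1)

# Dot products of normalized embeddings serve as similarity scores.
scores = embeddings[:1] @ embeddings[1:].T
print(scores)
```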
Suggested Cloud GPUs
Consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure, especially when encoding large corpora, since the underlying 7B-parameter model benefits substantially from GPU acceleration.
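For example, with a recent sentence-transformers release (one that forwards model_kwargs to transformers), the model can be loaded on a GPU in half precision to roughly halve its memory footprint; treat this as a sketch rather than an official recipe.

```python
import torch
from sentence_transformers import SentenceTransformer

# Load on a CUDA GPU in float16 to reduce memory use (assumes a recent
# sentence-transformers version that supports the model_kwargs argument).
model = SentenceTransformer(
    "infly/inf-retriever-v1",
    trust_remote_code=True,
    device="cuda",
    model_kwargs={"torch_dtype": torch.float16},
)
```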
License
The usage license for INF-Retriever-v1 is not specified in the provided documentation. Please refer to the Hugging Face model card for detailed licensing information.