NV-Embed-v2
Introduction
NV-Embed-v2 is a generalist embedding model that ranks No. 1 on the Massive Text Embedding Benchmark (MTEB) leaderboard as of August 30, 2024. It excels at text embedding tasks, particularly retrieval, thanks to two design innovations: a two-stage instruction tuning method and a hard-negative mining technique.
Architecture
- Base Model: Mistral-7B-v0.1
- Pooling Type: Latent-Attention
- Embedding Dimension: 4096
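Latent-attention pooling replaces plain mean or last-token (<EOS>) pooling: the decoder's final hidden states cross-attend to a small trainable latent array, pass through an MLP, and are mean-pooled into a single vector. The sketch below illustrates that mechanism as described in the NV-Embed paper; the latent count, head count, and MLP sizes here are illustrative assumptions, not the released configuration.

```python
import torch
import torch.nn as nn


class LatentAttentionPooling(nn.Module):
    """Sketch of latent-attention pooling: token hidden states act as
    queries over a trainable latent array, and the attended outputs are
    mean-pooled into one embedding. Sizes are illustrative assumptions."""

    def __init__(self, hidden_dim: int = 4096, num_latents: int = 512):
        super().__init__()
        # Trainable latent array serving as keys/values for cross-attention.
        self.latents = nn.Parameter(torch.randn(num_latents, hidden_dim))
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads=8, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, 4 * hidden_dim),
            nn.GELU(),
            nn.Linear(4 * hidden_dim, hidden_dim),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim) from the decoder's last layer.
        batch = hidden_states.shape[0]
        kv = self.latents.unsqueeze(0).expand(batch, -1, -1)
        # Queries are the token states; keys/values are the latents.
        attn_out, _ = self.cross_attn(hidden_states, kv, kv)
        out = self.mlp(attn_out)
        # Mean-pool over the sequence to get one 4096-d embedding per input.
        return out.mean(dim=1)
```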
Training
NV-Embed-v2 employs a two-stage instruction tuning method to improve accuracy on both retrieval and non-retrieval tasks: the first stage applies contrastive training with instructions on retrieval datasets, and the second blends non-retrieval instruction data into the training mix. The model also uses a novel hard-negative mining method that takes the positive relevance score into account to better screen out false negatives.
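In positive-aware hard-negative mining, candidate negatives retrieved by a teacher model are discarded as likely false negatives when they score too close to the labeled positive. Below is a minimal sketch of that filtering idea; the function name, the 95% margin, and the top-k value are illustrative assumptions, not the exact configuration used for NV-Embed-v2.

```python
def mine_hard_negatives(candidates, positive_score, margin=0.95, top_k=4):
    """Keep the top-k highest-scoring candidates whose score stays below
    a fraction of the positive's score; anything scoring above that
    threshold is treated as a likely false negative and dropped.

    candidates: list of (passage, score) pairs from a retriever.
    """
    threshold = margin * positive_score
    filtered = [(p, s) for p, s in candidates if s < threshold]
    # Hardest surviving negatives first.
    filtered.sort(key=lambda pair: pair[1], reverse=True)
    return [p for p, _ in filtered[:top_k]]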
Guide: Running Locally
- Install Required Packages
  - `pip install torch==2.2.0 transformers==4.42.4 flash-attn==2.2.0 sentence-transformers==2.7.0`
- Model Setup
  - Load the model using either Hugging Face Transformers or Sentence-Transformers (see the loading sketch after this list).
  - For multi-GPU support, wrap the model in `torch.nn.DataParallel`.
- Embedding Queries and Passages
  - Encode queries and passages with the appropriate instruction prefixes and apply normalization (an encoding sketch follows this list).
- Troubleshooting
  - Ensure the correct instruction templates are used for the different MTEB sub-tasks (the encoding sketch below includes one such template).
  - Authenticate with your Hugging Face token if access to the model is restricted.
  - Modify the Sentence-Transformers package if there are discrepancies in results.
- Hardware Recommendations
  - Use cloud GPUs for optimal performance, especially for large datasets or models requiring extensive computation.
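A loading sketch for the Transformers path; the repository ships custom modeling code, so `trust_remote_code=True` is required. The device placement and the DataParallel comment reflect the Model Setup notes above.

```python
import torch
from transformers import AutoModel

# The repository ships custom modeling code, so trust_remote_code is required.
model = AutoModel.from_pretrained("nvidia/NV-Embed-v2", trust_remote_code=True)
model = model.to("cuda")

# For multi-GPU setups the model can be wrapped in torch.nn.DataParallel;
# custom methods such as `encode` are then reached via `model.module`.
# model = torch.nn.DataParallel(model)
```

Loading through Sentence-Transformers is analogous: `SentenceTransformer("nvidia/NV-Embed-v2", trust_remote_code=True)`; the model card additionally adjusts `model.max_seq_length` and the tokenizer's padding side, so check it for the exact settings.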
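An encoding sketch, assuming the `model` loaded above: queries carry a task-specific instruction prefix (this is what the instruction-template troubleshooting item refers to), retrieval passages need none, and embeddings are L2-normalized before scoring. The keyword arguments follow the model card's usage example, and the task instruction, sample texts, and `max_length` value are illustrative; consult the card for the exact interface.

```python
import torch.nn.functional as F

# Each query is paired with a task instruction; retrieval passages need none.
task_instruction = "Given a question, retrieve passages that answer the question"
query_prefix = f"Instruct: {task_instruction}\nQuery: "

queries = ["how does latent-attention pooling work?"]
passages = ["NV-Embed-v2 pools token states through a latent-attention layer."]

# `encode` is provided by the model's remote code.
query_emb = model.encode(queries, instruction=query_prefix, max_length=4096)
passage_emb = model.encode(passages, instruction="", max_length=4096)

# L2-normalize so that dot products equal cosine similarities.
query_emb = F.normalize(query_emb, p=2, dim=1)
passage_emb = F.normalize(passage_emb, p=2, dim=1)

scores = query_emb @ passage_emb.T
print(scores)
```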
License
NV-Embed-v2 is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). It cannot be used for commercial purposes. For commercial use, consider using NVIDIA's NeMo Retriever Microservices.