NV-Embed-v2
Introduction
NV-Embed-v2 is a generalist embedding model that ranks No. 1 on the Massive Text Embedding Benchmark (MTEB) leaderboard as of August 30, 2024. It excels at text embedding tasks, particularly retrieval, thanks to two design innovations: a two-stage instruction tuning method and a hard-negative mining technique.
Architecture
- Base Model: Mistral-7B-v0.1
- Pooling Type: Latent-Attention
- Embedding Dimension: 4096
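Latent-attention pooling replaces plain mean or last-token (<EOS>) pooling: the decoder's final hidden states cross-attend to a small trainable latent array, pass through an MLP, and are mean-pooled into a single vector. The sketch below illustrates that mechanism as described in the NV-Embed paper; the latent count, head count, and MLP sizes here are illustrative assumptions, not the released configuration.

```python
import torch
import torch.nn as nn


class LatentAttentionPooling(nn.Module):
    """Sketch of latent-attention pooling: token hidden states act as
    queries over a trainable latent array, and the attended outputs are
    mean-pooled into one embedding. Sizes are illustrative assumptions."""

    def __init__(self, hidden_dim: int = 4096, num_latents: int = 512):
        super().__init__()
        # Trainable latent array serving as keys/values for cross-attention.
        self.latents = nn.Parameter(torch.randn(num_latents, hidden_dim))
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads=8, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, 4 * hidden_dim),
            nn.GELU(),
            nn.Linear(4 * hidden_dim, hidden_dim),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_dim) from the decoder's last layer.
        batch = hidden_states.shape[0]
        kv = self.latents.unsqueeze(0).expand(batch, -1, -1)
        # Queries are the token states; keys/values are the latents.
        attn_out, _ = self.cross_attn(hidden_states, kv, kv)
        out = self.mlp(attn_out)
        # Mean-pool over the sequence to get one 4096-d embedding per input.
        return out.mean(dim=1)
```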
Training
NV-Embed-v2 employs a two-stage instruction tuning method to improve accuracy on both retrieval and non-retrieval tasks: the first stage applies contrastive training with instructions on retrieval datasets, and the second blends non-retrieval instruction data into the training mix. The model also uses a novel hard-negative mining method that takes the positive relevance score into account to better screen out false negatives.
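In positive-aware hard-negative mining, candidate negatives retrieved by a teacher model are discarded as likely false negatives when they score too close to the labeled positive. Below is a minimal sketch of that filtering idea; the function name, the 95% margin, and the top-k value are illustrative assumptions, not the exact configuration used for NV-Embed-v2.

```python
def mine_hard_negatives(candidates, positive_score, margin=0.95, top_k=4):
    """Keep the top-k highest-scoring candidates whose score stays below
    a fraction of the positive's score; anything scoring above that
    threshold is treated as a likely false negative and dropped.

    candidates: list of (passage, score) pairs from a retriever.
    """
    threshold = margin * positive_score
    filtered = [(p, s) for p, s in candidates if s < threshold]
    # Hardest surviving negatives first.
    filtered.sort(key=lambda pair: pair[1], reverse=True)
    return [p for p, _ in filtered[:top_k]]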
Guide: Running Locally
- Install Required Packages
  - `pip install torch==2.2.0 transformers==4.42.4 flash-attn==2.2.0 sentence-transformers==2.7.0`
- Model Setup
  - Load the model using either Hugging Face Transformers or Sentence-Transformers (see the loading sketch after this list).
  - For multi-GPU support, wrap the model in `torch.nn.DataParallel`.
- Embedding Queries and Passages
  - Encode queries and passages with the appropriate instruction prefixes and apply normalization (an encoding sketch follows this list).
- Troubleshooting
  - Ensure the correct instruction templates are used for the different MTEB sub-tasks (the encoding sketch below includes one such template).
  - Authenticate with your Hugging Face token if access to the model is restricted.
  - Modify the Sentence-Transformers package if there are discrepancies in results.
- Hardware Recommendations
  - Use cloud GPUs for optimal performance, especially for large datasets or models requiring extensive computation.
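A loading sketch for the Transformers path; the repository ships custom modeling code, so `trust_remote_code=True` is required. The device placement and the DataParallel comment reflect the Model Setup notes above.

```python
import torch
from transformers import AutoModel

# The repository ships custom modeling code, so trust_remote_code is required.
model = AutoModel.from_pretrained("nvidia/NV-Embed-v2", trust_remote_code=True)
model = model.to("cuda")

# For multi-GPU setups the model can be wrapped in torch.nn.DataParallel;
# custom methods such as `encode` are then reached via `model.module`.
# model = torch.nn.DataParallel(model)
```

Loading through Sentence-Transformers is analogous: `SentenceTransformer("nvidia/NV-Embed-v2", trust_remote_code=True)`; the model card additionally adjusts `model.max_seq_length` and the tokenizer's padding side, so check it for the exact settings.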
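An encoding sketch, assuming the `model` loaded above: queries carry a task-specific instruction prefix (this is what the instruction-template troubleshooting item refers to), retrieval passages need none, and embeddings are L2-normalized before scoring. The keyword arguments follow the model card's usage example, and the task instruction, sample texts, and `max_length` value are illustrative; consult the card for the exact interface.

```python
import torch.nn.functional as F

# Each query is paired with a task instruction; retrieval passages need none.
task_instruction = "Given a question, retrieve passages that answer the question"
query_prefix = f"Instruct: {task_instruction}\nQuery: "

queries = ["how does latent-attention pooling work?"]
passages = ["NV-Embed-v2 pools token states through a latent-attention layer."]

# `encode` is provided by the model's remote code.
query_emb = model.encode(queries, instruction=query_prefix, max_length=4096)
passage_emb = model.encode(passages, instruction="", max_length=4096)

# L2-normalize so that dot products equal cosine similarities.
query_emb = F.normalize(query_emb, p=2, dim=1)
passage_emb = F.normalize(passage_emb, p=2, dim=1)

scores = query_emb @ passage_emb.T
print(scores)
```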
License
NV-Embed-v2 is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). It cannot be used for commercial purposes. For commercial use, consider using NVIDIA's NeMo Retriever Microservices.