NV-Embed-v2

by NVIDIA

Introduction

NV-Embed-v2 is a generalist embedding model that ranks No. 1 on the Massive Text Embedding Benchmark (MTEB) leaderboard as of August 30, 2024. It excels at text embedding tasks, particularly retrieval, thanks to two key design choices: a two-stage instruction tuning method and a hard-negative mining technique.

Architecture

  • Base Model: Mistral-7B-v0.1
  • Pooling Type: Latent-Attention
  • Embedding Dimension: 4096

Training

NV-Embed-v2 employs a two-stage instruction tuning method to improve accuracy on both retrieval and non-retrieval tasks. It also uses a novel hard-negative mining method that takes positive relevance scores into account to better remove false negatives from the training data.

Guide: Running Locally

  1. Install Required Packages

    pip install torch==2.2.0 transformers==4.42.4 flash-attn==2.2.0 sentence-transformers==2.7.0
    
  2. Model Setup

    • Load the model using either Hugging Face Transformers or Sentence-Transformers; a sketch of each follows this list.
    • For multi-GPU support, wrap the model with torch.nn.DataParallel.
  3. Embedding Queries and Passages

    • Encode queries and passages with the model's encode method and L2-normalize the resulting embeddings (see the sketches after this list).
  4. Troubleshooting

    • Ensure the correct instruction template is used for each MTEB sub-task; the instruction prefix differs by task.
    • Authenticate with your Hugging Face token if access to the model is restricted.
    • If results differ between the Transformers and Sentence-Transformers paths, you may need to modify the Sentence-Transformers setup (e.g., EOS handling and tokenizer padding side) to match the reference code.
  5. Hardware Recommendations

    • Use a cloud GPU with sufficient memory for this 7B-parameter model, especially when embedding large datasets.
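
Below is a minimal sketch of steps 2 and 3 with Hugging Face Transformers, assuming the usage pattern from the official model card: the encode method and its instruction/max_length arguments are supplied by the model's remote code (hence trust_remote_code=True), and the task instruction, queries, and passages are illustrative placeholders.

    import torch
    import torch.nn.functional as F
    from transformers import AutoModel

    # Retrieval queries carry an instruction prefix; passages do not.
    task_instruction = "Given a question, retrieve passages that answer the question"
    query_prefix = "Instruct: " + task_instruction + "\nQuery: "
    passage_prefix = ""

    # Illustrative inputs.
    queries = ["how do neural text embeddings work?"]
    passages = [
        "Text embedding models map sentences to dense vectors so that "
        "semantically similar texts end up close together in vector space."
    ]

    # trust_remote_code=True pulls in the model's custom encode() implementation.
    model = AutoModel.from_pretrained("nvidia/NV-Embed-v2", trust_remote_code=True)
    model.eval()

    # Optional multi-GPU: the model card wraps each submodule in DataParallel.
    # for key, module in model._modules.items():
    #     model._modules[key] = torch.nn.DataParallel(module)

    max_length = 32768
    with torch.no_grad():
        query_embeddings = model.encode(queries, instruction=query_prefix, max_length=max_length)
        passage_embeddings = model.encode(passages, instruction=passage_prefix, max_length=max_length)

    # L2-normalize so that dot products equal cosine similarities.
    query_embeddings = F.normalize(query_embeddings, p=2, dim=1)
    passage_embeddings = F.normalize(passage_embeddings, p=2, dim=1)

    scores = query_embeddings @ passage_embeddings.T
    print(query_embeddings.shape)  # (num_queries, 4096)
    print(scores)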
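
An equivalent sketch with Sentence-Transformers, assuming the reference setup from the model card: the EOS token is appended to inputs by hand and the tokenizer padding side is set to "right" (see the troubleshooting note above). The prompt string varies by MTEB sub-task.

    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("nvidia/NV-Embed-v2", trust_remote_code=True)
    model.max_seq_length = 32768
    model.tokenizer.padding_side = "right"

    query_prefix = "Instruct: Given a question, retrieve passages that answer the question\nQuery: "
    queries = ["how do neural text embeddings work?"]
    passages = ["Text embedding models map sentences to dense vectors."]

    # The reference setup appends the EOS token to every input by hand.
    def add_eos(texts):
        return [t + model.tokenizer.eos_token for t in texts]

    query_embeddings = model.encode(add_eos(queries), prompt=query_prefix, normalize_embeddings=True)
    passage_embeddings = model.encode(add_eos(passages), normalize_embeddings=True)

    # Embeddings are already normalized, so the dot product is the cosine score.
    scores = query_embeddings @ passage_embeddings.T
    print(scores)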

License

NV-Embed-v2 is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). It cannot be used for commercial purposes. For commercial use, consider using NVIDIA's NeMo Retriever Microservices.
