snowflake arctic embed m
Snowflake
Introduction
The Snowflake Arctic Embed models are a suite of text embedding models optimized for retrieval tasks. They report state-of-the-art retrieval performance for their size on standard benchmarks and are intended as open alternatives to closed-source embedding models.
Architecture
The models are based on adaptations of existing models like bert-base-uncased and intfloat/e5-base-unsupervised, fine-tuned for improved retrieval performance. They come in multiple sizes, from small to large, catering to different performance needs and constraints. Each model variant offers a balance between size and retrieval accuracy.
Training
The training process involves a multi-stage pipeline. Initially, models are trained with large batches of query-document pairs, followed by further optimization using a smaller dataset. This dataset consists of triplets of query, positive, and negative documents, where the negatives are derived through hard mining techniques. This approach enhances retrieval accuracy by focusing on difficult examples.
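The effect of hard negatives can be illustrated with a small sketch. The source does not name the loss function, so this assumes an InfoNCE-style contrastive objective, which is a common choice for triplet training of this kind. The function name, the temperature value, and the toy vectors are all illustrative assumptions, not details of the actual training pipeline.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def triplet_contrastive_loss(query, positive, negatives, temperature=0.05):
    """InfoNCE-style loss for one (query, positive, negatives) example.

    The loss is the negative log-probability of picking the positive
    document out of the candidate set [positive] + negatives.
    The temperature is an assumed, illustrative value.
    """
    q = normalize(query)
    candidates = [normalize(positive)] + [normalize(n) for n in negatives]
    logits = [dot(q, c) / temperature for c in candidates]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]


# Toy 2-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0]
positive = [0.9, 0.1]
easy_negative = [-1.0, 0.0]  # points away from the query
hard_negative = [0.8, 0.3]   # almost as close to the query as the positive

loss_easy = triplet_contrastive_loss(query, positive, [easy_negative])
loss_hard = triplet_contrastive_loss(query, positive, [hard_negative])
print(f"easy negative loss: {loss_easy:.4f}")
print(f"hard negative loss: {loss_hard:.4f}")
```

In this toy setup the hard negative yields the larger loss, which is why mined negatives give the model a stronger training signal than negatives it already separates easily.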
Guide: Running Locally
- Setup Environment: Install Python and either Hugging Face's transformers or sentence-transformers; if you are using JavaScript, install transformers.js.
- Install Model: Load the model with transformers or sentence-transformers, for example:
    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")
- Prepare Data: Collect your queries and documents as lists of strings. With transformers, tokenize them using the model's tokenizer; sentence-transformers handles tokenization internally.
- Compute Embeddings: Generate embeddings for your queries and documents using the model.
- Compute Similarities: Use dot products or cosine similarity to compare query embeddings against document embeddings; higher scores indicate more relevant documents.
- Inference: Deploy locally with Docker using Infinity, which exposes an OpenAI-compatible API and can use GPUs (including cloud GPUs) for acceleration:
    docker run --gpus all ...
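The embedding and similarity steps above can be sketched as follows. This is a minimal sketch, not the model's official usage: the helper names (embed, cosine_similarity, rank_documents) are hypothetical, and the toy vectors stand in for real embeddings so the ranking logic can be run without downloading the model. The embed helper assumes sentence-transformers is installed and imports it lazily.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_documents(query_embedding, document_embeddings):
    """Return (document index, score) pairs sorted by descending score."""
    scores = [cosine_similarity(query_embedding, d) for d in document_embeddings]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


def embed(texts):
    """Embed a list of strings with the model from the guide.

    Imported lazily so the ranking helpers work without the library;
    the first call downloads the model weights.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m")
    return model.encode(texts).tolist()


# Toy 3-dimensional vectors standing in for real embeddings.
toy_query = [0.2, 0.9, 0.1]
toy_documents = [
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.2],
    [0.0, 0.1, 0.9],
]
ranking = rank_documents(toy_query, toy_documents)
for index, score in ranking:
    print(f"document {index}: {score:.3f}")
```

With real data, the toy vectors would be replaced by embed(documents) and embed([query])[0]. Some retrieval models expect a prefix on queries, so it is worth checking the model card before embedding queries and documents the same way.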
Suggested Cloud GPUs
- Consider using cloud services like AWS, Google Cloud, or Azure for GPU-accelerated performance.
License
The Snowflake Arctic Embed models are released under the Apache License 2.0, allowing for commercial use free of charge.