bge-small-en-v1.5
Introduction
The BGE-Small-EN-V1.5 model is part of the FlagEmbedding project led by the Beijing Academy of Artificial Intelligence (BAAI). The model targets tasks such as feature extraction, sentence similarity, and text-embedding inference, and runs on multiple frameworks including PyTorch, ONNX, and Transformers.
Architecture
The BGE-Small-EN-V1.5 model is based on the BERT architecture and is part of the broader BGE model series. It is configured for sentence-transformers and feature extraction tasks, emphasizing retrieval and sentence similarity.
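Because it is a standard BERT-style encoder, the same weights can also be driven through the plain Transformers API. The snippet below is a minimal sketch following the CLS-pooling-plus-L2-normalization convention documented for the BGE series; the 384-dimensional output is specific to the small model.

  import torch
  from transformers import AutoModel, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-small-en-v1.5')
  model = AutoModel.from_pretrained('BAAI/bge-small-en-v1.5')
  model.eval()

  sentences = ["Example sentence 1", "Example sentence 2"]
  inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

  with torch.no_grad():
      outputs = model(**inputs)
      # BGE uses the [CLS] token embedding, L2-normalized, as the sentence vector.
      embeddings = outputs.last_hidden_state[:, 0]
      embeddings = torch.nn.functional.normalize(embeddings, p=2, dim=1)

  print(embeddings.shape)  # torch.Size([2, 384]) for the small model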
Training
The BGE models are pre-trained using RetroMAE and fine-tuned with contrastive learning on large-scale paired data to optimize for retrieval tasks. Fine-tuning examples and scripts are available for users to adapt the models to their own datasets.
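To make the fine-tuning objective concrete, the sketch below shows an in-batch-negatives contrastive loss of the kind commonly used for retrieval training on paired data. It illustrates the general technique only; the function name and temperature value are assumptions for illustration, not the FlagEmbedding training code.

  import torch
  import torch.nn.functional as F

  def in_batch_contrastive_loss(query_emb, passage_emb, temperature=0.05):
      # query_emb, passage_emb: (batch, dim) L2-normalized embeddings of paired
      # texts. The passage at index i is query i's positive; every other passage
      # in the batch serves as a negative.
      scores = query_emb @ passage_emb.T / temperature  # (batch, batch) similarities
      labels = torch.arange(scores.size(0), device=scores.device)
      return F.cross_entropy(scores, labels)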
Guide: Running Locally
- Installation: Install the required packages using pip:

  pip install -U FlagEmbedding sentence-transformers
- Model Loading: Use the model with FlagEmbedding (a Sentence-Transformers alternative is sketched after this list):

  from FlagEmbedding import FlagModel
  model = FlagModel('BAAI/bge-small-en-v1.5')
- Inference: Encode sentences to get embeddings:

  sentences = ["Example sentence 1", "Example sentence 2"]
  embeddings = model.encode(sentences)
- GPU Usage: For enhanced performance, utilize cloud GPUs from providers such as AWS, Google Cloud, or Azure; a device-placement sketch follows this list.
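As noted in the steps above, the model also loads directly through Sentence-Transformers, and either library can run on a GPU. The snippet below is a minimal sketch: device='cuda' assumes an NVIDIA GPU is available, and normalize_embeddings=True makes the dot product of two embeddings equal their cosine similarity.

  from sentence_transformers import SentenceTransformer

  # Place the model on a GPU; 'cuda' assumes an NVIDIA device is present.
  model = SentenceTransformer('BAAI/bge-small-en-v1.5', device='cuda')

  sentences = ["Example sentence 1", "Example sentence 2"]
  # Normalized embeddings let a plain dot product act as cosine similarity.
  embeddings = model.encode(sentences, normalize_embeddings=True)

  similarity = embeddings @ embeddings.T  # pairwise cosine similarities
  print(similarity)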
License
FlagEmbedding is released under the MIT License, allowing free use, modification, and distribution, including for commercial purposes. The full license text is available in the project's repository.