PubMedBERT-Base-Embeddings-1M
NeuML

Introduction
PubMedBERT-Base-Embeddings-1M is a pruned version of the original PubMedBERT Embeddings 2M that keeps only the top 50% most frequently used tokens. It is suitable for tasks such as semantic search and retrieval augmented generation (RAG).
Architecture
This model builds on the sentence-transformers and Model2Vec architectures. It uses a static embedding approach, storing vectors at int16 precision, which makes it a good fit for smaller and lower-powered devices.
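Because the embeddings are static, encoding a sentence reduces to a vocabulary lookup followed by pooling; there is no transformer forward pass at inference time. The sketch below illustrates the idea; `token_ids` and `embedding_table` are illustrative placeholders, not the model's internal API.

```python
# Illustrative sketch of static-embedding encoding: a vocabulary
# lookup plus mean pooling, with no transformer forward pass.
import numpy as np

def encode(token_ids: list[int], embedding_table: np.ndarray) -> np.ndarray:
    """token_ids: ids from the model's tokenizer; embedding_table: (vocab, dim) int16."""
    vectors = embedding_table[token_ids].astype(np.float32)  # dequantize int16 rows
    sentence = vectors.mean(axis=0)                          # mean pooling
    return sentence / np.linalg.norm(sentence)               # unit-normalize
```

This is why static models run comfortably on CPUs: the dominant cost is a table lookup rather than matrix multiplication.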
Training
The model underwent vocabulary pruning to reduce size while retaining performance. This process involved tokenizing datasets, calculating token weights, and applying PCA for dimensionality reduction. The final embeddings are re-weighted, normalized, and stored in a more compact form.
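A rough sketch of that pipeline is below. The 50% keep ratio comes from this model card; the SIF-style weighting constant, output dimensionality, and int16 scaling are assumptions for illustration, not NeuML's actual training code.

```python
# Rough sketch of the vocabulary-pruning pipeline described above.
# Assumptions: SIF-style re-weighting and symmetric int16 quantization.
from collections import Counter

import numpy as np
from sklearn.decomposition import PCA

def prune_vocabulary(embedding_table: np.ndarray, counts: Counter,
                     keep_ratio: float = 0.5, dims: int = 256, a: float = 1e-3):
    """embedding_table: (vocab_size, dim) token vectors; counts: token id -> frequency."""
    # 1. Keep the top 50% most frequently used tokens
    keep = [t for t, _ in counts.most_common(int(len(counts) * keep_ratio))]
    pruned = embedding_table[keep]

    # 2. Apply PCA for dimensionality reduction
    pruned = PCA(n_components=dims).fit_transform(pruned)

    # 3. Re-weight rows so very frequent tokens contribute less
    #    (SIF-style weighting; the exact scheme is an assumption)
    freq = np.array([counts[t] for t in keep], dtype=np.float64)
    pruned *= (a / (a + freq / freq.sum()))[:, None]

    # 4. Quantize to int16 for compact storage; sentence vectors are
    #    normalized at encode time
    scale = np.abs(pruned).max()
    quantized = np.round(pruned / scale * np.iinfo(np.int16).max).astype(np.int16)
    return quantized, keep
```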
Guide: Running Locally
1. Install Required Libraries: Ensure you have `txtai`, `sentence-transformers`, and `Model2Vec` installed (e.g., `pip install txtai sentence-transformers model2vec`).
2. Load the Model:
   - Using `txtai`:

     ```python
     import txtai

     embeddings = txtai.Embeddings(path="neuml/pubmedbert-base-embeddings-1M", content=True)
     ```

   - Using `sentence-transformers`:

     ```python
     from sentence_transformers import SentenceTransformer
     from sentence_transformers.models import StaticEmbedding

     static = StaticEmbedding.from_model2vec("neuml/pubmedbert-base-embeddings-1M")
     model = SentenceTransformer(modules=[static])
     ```

   - Using `Model2Vec`:

     ```python
     from model2vec import StaticModel

     model = StaticModel.from_pretrained("neuml/pubmedbert-base-embeddings-1M")
     ```
3. Run Inference: Encode sentences or index documents with the loaded model; a short example follows this list.
4. Cloud GPUs: For optimal performance, especially during indexing, it is recommended to use a cloud GPU such as an NVIDIA RTX 3090.
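For example, with the `txtai` instance created in step 2, indexing and searching a couple of documents looks like this (the sample records are illustrative):

```python
# Index sample documents and run a semantic search with the txtai
# embeddings instance created above; the records are illustrative.
documents = [
    (0, "Metformin is a first-line treatment for type 2 diabetes", None),
    (1, "ACE inhibitors are used to manage high blood pressure", None),
]
embeddings.index(documents)

# content=True stores the text, so results include id, text, and score
for result in embeddings.search("diabetes medication", limit=1):
    print(result["id"], result["text"], result["score"])
```

With `sentence-transformers` or `Model2Vec`, the equivalent call is `model.encode([...])`, which returns the vectors directly.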
License
This model is licensed under the Apache-2.0 License, allowing for broad use and distribution.