PubMedBERT Base Embeddings 1M

NeuML

Introduction

The PubMedBERT-Base-Embeddings-1M model is a pruned version of the original PubMedBERT Embeddings 2M model, retaining only the top 50% most frequently used tokens. It is suitable for tasks such as semantic search and retrieval augmented generation (RAG).

Architecture

This model builds on the sentence-transformers and Model2Vec architectures. It uses a static embedding approach and stores vectors at int16 precision, which reduces the memory footprint and benefits smaller or low-powered devices.
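
As a concrete illustration, the sketch below shows what the static approach means in practice: encoding is a vocabulary lookup plus pooling rather than a transformer forward pass. It assumes the Model2Vec StaticModel API shown later in this guide; note that the dtype reported at runtime may differ from the int16 storage format, since the library can dequantize vectors when loading.

  from model2vec import StaticModel

  # Load the pruned static embedding model from the Hugging Face Hub
  model = StaticModel.from_pretrained("neuml/pubmedbert-base-embeddings-1M")

  # Encoding is a token lookup plus pooling, no transformer forward pass required
  vectors = model.encode([
      "Metformin is a first-line treatment for type 2 diabetes",
      "Insulin regulates blood glucose levels",
  ])

  # One dense vector per input sentence
  print(vectors.shape, vectors.dtype)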

Training

The model underwent vocabulary pruning to reduce its size while retaining performance. The process involved tokenizing datasets, calculating per-token weights, and applying PCA for dimensionality reduction. The final embeddings are re-weighted, normalized, and stored in a more compact form at int16 precision.
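
To make these steps concrete, here is a minimal sketch of such a pruning pipeline. The 50% cutoff matches the description above, but the inverse-frequency weighting, the 256-dimension target, and the helper names are illustrative assumptions, not the exact NeuML training code.

  from collections import Counter

  import numpy as np
  from sklearn.decomposition import PCA

  def prune_vocabulary(tokenized_docs, token_vectors, keep_ratio=0.5, dims=256):
      # Count how often each token id appears across the tokenized datasets
      counts = Counter(tid for doc in tokenized_docs for tid in doc)

      # Keep only the most frequently used tokens (top 50% here)
      keep = [tid for tid, _ in counts.most_common(int(len(counts) * keep_ratio))]
      pruned = token_vectors[keep]

      # Reduce dimensionality with PCA
      pruned = PCA(n_components=dims).fit_transform(pruned)

      # Re-weight each token vector (inverse frequency here), then L2-normalize
      weights = 1.0 / np.array([counts[tid] for tid in keep])
      pruned = pruned * weights[:, None]
      pruned /= np.linalg.norm(pruned, axis=1, keepdims=True)

      # Store in a more compact form: scale and cast to int16
      scale = np.abs(pruned).max() / np.iinfo(np.int16).max
      return keep, np.round(pruned / scale).astype(np.int16), scale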

Guide: Running Locally

  1. Install Required Libraries: Ensure you have txtai, sentence-transformers, and Model2Vec installed.
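
     For example, with pip (these are the standard PyPI package names):

      pip install txtai sentence-transformers model2vec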

  2. Load the Model:

    • Using txtai:
      import txtai
      embeddings = txtai.Embeddings(path="neuml/pubmedbert-base-embeddings-1M", content=True)
      
    • Using sentence-transformers:
      from sentence_transformers import SentenceTransformer
      from sentence_transformers.models import StaticEmbedding
      static = StaticEmbedding.from_model2vec("neuml/pubmedbert-base-embeddings-1M")
      model = SentenceTransformer(modules=[static])
      
    • Using Model2Vec:
      from model2vec import StaticModel
      model = StaticModel.from_pretrained("neuml/pubmedbert-base-embeddings-1M")
      
  3. Run Inference: Encode sentences or index documents using the loaded model.
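
     For example, with the txtai instance from the previous step (the document and query are illustrative):

      # Index a document, then run a semantic search over it
      embeddings.index([(0, "Aspirin reduces the risk of heart attack", None)])
      print(embeddings.search("cardiovascular prevention", 1))

     With the sentence-transformers or Model2Vec models, call model.encode(...) to get sentence vectors directly.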

  4. Cloud GPUs: For optimal performance, especially during indexing, it is recommended to use a cloud GPU such as an NVIDIA RTX 3090.

License

This model is licensed under the Apache-2.0 License, allowing for broad use and distribution.
