PubMedBERT Base Embeddings
NeuML
Introduction
PubMedBERT Embeddings is fine-tuned from PubMedBERT-base to map sentences and paragraphs to a 768-dimensional dense vector space. It is tailored for tasks such as clustering and semantic search, particularly over medical literature. The model is fine-tuned with sentence-transformers on a dataset of PubMed title-abstract pairs.
Architecture
The model architecture includes a SentenceTransformer
with the following components:
- Transformer Model: BertModel with a maximum sequence length of 512 and case sensitivity set to false.
- Pooling Layer: Configured for mean pooling over token embeddings, which aggregates the token embeddings into a single sentence embedding.
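For illustration, the sketch below shows how an equivalent Transformer plus mean-pooling stack could be assembled with sentence-transformers; in practice, loading the published model directly (see the guide below) builds this composition automatically.

```python
from sentence_transformers import SentenceTransformer, models

# Transformer module: BertModel weights, 512-token sequence limit, no extra lower-casing
word_embedding_model = models.Transformer(
    "neuml/pubmedbert-base-embeddings", max_seq_length=512, do_lower_case=False
)

# Pooling module: mean-pool token embeddings into one sentence embedding
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(), pooling_mode="mean"
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```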
Training
The model was trained with the following parameters (a comparable setup is sketched in code after the list):
- DataLoader: Batch size of 24 with a random sampler.
- Loss Function: MultipleNegativesRankingLoss with a scale of 20.0 and cosine similarity.
- Training Parameters: 1 epoch, evaluation every 500 steps, AdamW optimizer with a learning rate of 2e-05, a WarmupLinear scheduler with 10,000 warmup steps, and a weight decay of 0.01.
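The snippet below is a hedged sketch of how a comparable fine-tuning run could be set up with the classic sentence-transformers fit API; the base checkpoint and training examples shown here are assumptions, not the exact values used for this model.

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses, util

# Hypothetical title-abstract pairs; the real training data is PubMed title-abstract pairs.
train_examples = [
    InputExample(texts=["Example article title", "Example article abstract text ..."]),
    # ...
]

# Assumed PubMedBERT-base starting checkpoint; substitute the checkpoint actually used.
model = SentenceTransformer("microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext")

# Batch size of 24 with random sampling
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=24)

# MultipleNegativesRankingLoss with scale 20.0 and cosine similarity
train_loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    evaluation_steps=500,  # only takes effect if an evaluator is supplied
    scheduler="WarmupLinear",
    warmup_steps=10000,
    optimizer_params={"lr": 2e-05},  # AdamW is the default optimizer
    weight_decay=0.01,
)
```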
Guide: Running Locally
Steps
1. Install the required libraries:
- txtai
- sentence-transformers
- transformers
- torch
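These can typically be installed with pip, for example `pip install txtai sentence-transformers transformers torch`.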
2. Using txtai:

```python
import txtai

# Create an embeddings database backed by this model
embeddings = txtai.Embeddings(path="neuml/pubmedbert-base-embeddings", content=True)

# Index the documents to search over
embeddings.index(documents())

# Run a semantic search
results = embeddings.search("query to run")
```
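Note that documents() is not defined in the snippet above; it stands for whatever iterable supplies the records to index. A hypothetical example, assuming simple (id, text, tags) tuples:

```python
# Hypothetical helper yielding records as (id, text, tags) tuples
def documents():
    data = [
        "Metformin is a first-line therapy for type 2 diabetes.",
        "Statins lower LDL cholesterol and reduce cardiovascular risk.",
    ]
    for uid, text in enumerate(data):
        yield (uid, text, None)
```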
3. Using SentenceTransformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("neuml/pubmedbert-base-embeddings")

# Encode sentences into 768-dimensional embeddings
embeddings = model.encode(["This is an example sentence", "Each sentence is converted"])
```
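As a follow-up, the resulting embeddings can be compared with cosine similarity using the sentence-transformers util helpers (an illustrative sketch, not part of the original example):

```python
from sentence_transformers import util

# Cosine similarity between the two example embeddings (returns a 1 x 1 tensor)
score = util.cos_sim(embeddings[0], embeddings[1])
print(score)
```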
4. Using Hugging Face Transformers:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Mean pooling: average token embeddings, weighted by the attention mask
def meanpooling(output, mask):
    embeddings = output[0]  # last hidden state
    mask = mask.unsqueeze(-1).expand(embeddings.size()).float()
    return torch.sum(embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("neuml/pubmedbert-base-embeddings")
model = AutoModel.from_pretrained("neuml/pubmedbert-base-embeddings")

inputs = tokenizer(
    ["This is an example sentence", "Each sentence is converted"],
    padding=True,
    truncation=True,
    return_tensors="pt",
)

with torch.no_grad():
    output = model(**inputs)

embeddings = meanpooling(output, inputs["attention_mask"])
```
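For a quick sanity check, the mean-pooled embeddings can be compared directly with PyTorch; this mirrors the SentenceTransformers example above and is only an illustrative sketch:

```python
import torch.nn.functional as F

# Cosine similarity between the two mean-pooled sentence embeddings
similarity = F.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(similarity.item())
```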
Cloud GPUs
Consider using cloud GPU services like AWS EC2, Google Cloud, or Azure for efficient processing, especially for large datasets.
License
This model is licensed under the Apache-2.0 License.