ModernBERT-base-embed

tasksource

Introduction

ModernBERT-base-embed is a Sentence Transformer model fine-tuned from the answerdotai/ModernBERT-base checkpoint to produce 768-dimensional dense vector embeddings. It is designed for tasks such as semantic textual similarity, semantic search, and paraphrase mining, and was trained on a mix of datasets with several loss functions.

Architecture

The model follows the standard Sentence Transformer architecture: a Transformer component that handles sequences of up to 8192 tokens, followed by a pooling layer that generates the final embeddings. Cosine similarity is the model's similarity function.

Training

ModernBERT-base-embed was trained using several datasets, including tomaarsen/natural-questions-hard-negatives, sentence-transformers/all-nli, and glue/mrpc, among others. Training involved various loss functions like MultipleNegativesRankingLoss and SoftmaxLoss, with hyperparameters such as a learning rate of 3.5e-5 and a batch size of 24.

Guide: Running Locally

  1. Install Sentence Transformers Library:

    pip install -U sentence-transformers
    
  2. Load the Model:

    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("tasksource/ModernBERT-base-embed")
    
  3. Run Inference:

    sentences = [
        'A chef is preparing some food',
        'A chef is preparing a meal',
        'A dog is in a sandy area with the sand that is being stirred up into the air and several plants are in the background',
    ]
    embeddings = model.encode(sentences)
    print(embeddings.shape)  # (3, 768)
    
  4. Get Similarity Scores:

    similarities = model.similarity(embeddings, embeddings)
    print(similarities.shape)  # (3, 3)
    

Suggested Cloud GPUs: For faster inference and training, consider cloud GPU offerings from providers such as AWS, Google Cloud, or Azure.

License

The model is distributed under the Apache License 2.0, allowing for both personal and commercial usage, modification, and distribution of the software.
