S-Bluebert-snli-multinli-stsb
Introduction
S-Bluebert-snli-multinli-stsb is a sentence-transformers model that maps sentences and paragraphs to a 768-dimensional dense vector space, making it suitable for tasks such as clustering and semantic search.
Architecture
The model employs a SentenceTransformer architecture comprising:
- A Transformer layer based on BertModel with a maximum sequence length of 75 and no lower casing.
- A Pooling layer using mean token pooling over 768-dimensional word embeddings.
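For reference, this composition can be expressed with the sentence-transformers modules API. The following is a minimal sketch, with parameter values taken from the configuration above:

```python
from sentence_transformers import SentenceTransformer, models

# Transformer module: BertModel backbone, max sequence length 75, no lower casing.
word_embedding_model = models.Transformer(
    'pritamdeka/S-Bluebert-snli-multinli-stsb',
    max_seq_length=75,
    do_lower_case=False,
)

# Pooling module: mean pooling over the 768-dimensional token embeddings.
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768 for this model
    pooling_mode_mean_tokens=True,
    pooling_mode_cls_token=False,
    pooling_mode_max_tokens=False,
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```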
Training
The model was trained using:
- DataLoader configured with a batch size of 64.
- A CosineSimilarityLoss function.
- Training parameters included 4 epochs, a learning rate of 2e-05, and a weight decay of 0.01.
- The optimizer used was AdamW, and the scheduler was set to WarmupLinear.
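A minimal sketch of what such a training run looks like with the sentence-transformers fit API is shown below. The two InputExample pairs are illustrative placeholders, not the actual NLI/STS-B training data, and any setting not listed above (e.g. warmup steps) is left at the library default:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('pritamdeka/S-Bluebert-snli-multinli-stsb')

# Placeholder sentence pairs with similarity labels; the real training
# data is not reproduced here.
train_examples = [
    InputExample(texts=['A man is eating food.', 'A man eats something.'], label=0.8),
    InputExample(texts=['A plane is taking off.', 'A dog runs in a field.'], label=0.1),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)
train_loss = losses.CosineSimilarityLoss(model=model)

# AdamW is the fit() default optimizer; scheduler, learning rate, weight
# decay, and epoch count mirror the configuration listed above.
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=4,
    scheduler='WarmupLinear',
    optimizer_params={'lr': 2e-05},
    weight_decay=0.01,
)
```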
Guide: Running Locally
- Install Dependencies: Ensure you have `sentence-transformers` installed, or alternatively `transformers` and `torch`:

```bash
pip install -U sentence-transformers
pip install transformers torch
```
- Using Sentence-Transformers:
```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('pritamdeka/S-Bluebert-snli-multinli-stsb')
embeddings = model.encode(sentences)
print(embeddings)
```
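Beyond printing raw embeddings, a common next step is the semantic search mentioned in the introduction. The sketch below scores a query against a small corpus with `util.cos_sim`; the query and corpus sentences are illustrative placeholders:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('pritamdeka/S-Bluebert-snli-multinli-stsb')

# Illustrative placeholder corpus and query.
corpus = [
    "The patient was prescribed antibiotics for the infection.",
    "Stock prices fell sharply after the announcement.",
]
query = "The doctor gave the patient medication."

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Cosine similarity between the query and each corpus sentence;
# higher scores indicate closer meaning.
scores = util.cos_sim(query_embedding, corpus_embeddings)
print(scores)
```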
- Using Transformers:
```python
from transformers import AutoTokenizer, AutoModel
import torch

# Mean pooling: average the token embeddings, weighting by the attention
# mask so padding tokens are ignored.
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element holds all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained('pritamdeka/S-Bluebert-snli-multinli-stsb')
model = AutoModel.from_pretrained('pritamdeka/S-Bluebert-snli-multinli-stsb')

sentences = ['This is an example sentence', 'Each sentence is converted']

# Tokenize, run the model, then pool token embeddings into sentence embeddings.
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

with torch.no_grad():
    model_output = model(**encoded_input)

sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
```
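If you need cosine similarities from this transformers-based pipeline, a common follow-up is to L2-normalize the pooled embeddings so that dot products become cosine similarities. The random tensor below merely stands in for the `sentence_embeddings` computed above:

```python
import torch
import torch.nn.functional as F

# Stand-in for the (batch_size x 768) `sentence_embeddings` tensor from
# the snippet above; replace with the real pooled output.
sentence_embeddings = torch.randn(2, 768)

# L2-normalize so that dot products equal cosine similarities.
normalized = F.normalize(sentence_embeddings, p=2, dim=1)
cosine_similarities = normalized @ normalized.T
print(cosine_similarities)
```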
- Cloud GPUs: For faster inference, consider running the model on a cloud platform with GPU support, such as AWS, Google Cloud, or Azure.
License
The model and its code are provided under terms that require appropriate citation of the 2021 publication by Deka and Jurek-Loughrey.