distilbert-base-nli-mean-tokens

sentence-transformers

Introduction

The distilbert-base-nli-mean-tokens model is part of the Sentence Transformers library. It maps sentences and paragraphs to a 768-dimensional dense vector space, which makes it usable for tasks such as clustering and semantic search. The model is deprecated, however, because it produces low-quality sentence embeddings; recommended alternatives are listed at SBERT.net.
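To illustrate how dense vectors support semantic search, here is a minimal sketch that ranks a small corpus by cosine similarity to a query. The 3-dimensional vectors are toy stand-ins for real 768-dimensional embeddings, and the helper name `cosine_scores` is an assumption made for this example, not part of the library.

```python
import numpy as np

def cosine_scores(query, corpus):
    """Cosine similarity between one query vector and each row of corpus."""
    query = query / np.linalg.norm(query)
    corpus = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    return corpus @ query

# Toy stand-ins for sentence embeddings (real ones would have 768 dimensions).
corpus_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
])
query_embedding = np.array([1.0, 0.1, 0.0])

scores = cosine_scores(query_embedding, corpus_embeddings)
ranking = np.argsort(-scores)  # corpus row indices, most similar first
print(ranking, scores)
```

In practice the vectors would come from `model.encode(...)`; the ranking step itself stays the same.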

Architecture

The model architecture consists of a Transformer component based on DistilBertModel with a maximum sequence length of 128 tokens (longer inputs are truncated), followed by a pooling layer that averages the token embeddings to produce one sentence embedding.
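The pooling step can be sketched in NumPy as follows. This is an illustration under toy assumptions (hidden size 2 instead of 768, hand-written token vectors), and `mean_pool` is a hypothetical helper name; it shows only that padded positions are excluded from the average.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padded positions."""
    # Add a trailing axis so the mask broadcasts over the hidden dimension.
    mask = attention_mask[..., np.newaxis].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for an all-padding row.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

# Batch of 2 sentences, 3 token slots each, hidden size 2 (toy values).
token_embeddings = np.array([
    [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]],  # third slot is padding
    [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]],
])
attention_mask = np.array([[1, 1, 0], [1, 1, 1]])

sentence_embeddings = mean_pool(token_embeddings, attention_mask)
print(sentence_embeddings)
```

The padded slot in the first sentence carries large values on purpose: because its mask entry is 0, it does not affect the result.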

Training

This model was trained by the Sentence Transformers team as described in their publication "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks". The training method involved using a Siamese network structure to produce sentence embeddings tailored for semantic similarity tasks.
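A rough sketch of the Siamese idea, assuming the classification objective described in the cited paper, where the two sentence embeddings u and v are combined as (u, v, |u - v|) before a softmax classifier. The toy encoder, the weight matrices `W_enc` and `W_cls`, and all sizes are hypothetical stand-ins, not the actual training code.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, num_labels = 4, 3  # toy sizes; the real embedding size is 768

# Hypothetical stand-in for the shared encoder: one weight matrix for both inputs.
W_enc = rng.normal(size=(hidden, hidden))
# Hypothetical classifier weights over the concatenated features (u, v, |u - v|).
W_cls = rng.normal(size=(3 * hidden, num_labels))

def encode(token_vectors):
    """Toy encoder: project tokens with the shared weights, then mean-pool."""
    return (token_vectors @ W_enc).mean(axis=0)

def siamese_logits(tokens_a, tokens_b):
    # Both sentences pass through the same encoder with the same weights.
    u, v = encode(tokens_a), encode(tokens_b)
    features = np.concatenate([u, v, np.abs(u - v)])
    return features @ W_cls

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

tokens_a = rng.normal(size=(5, hidden))  # sentence A with 5 tokens
tokens_b = rng.normal(size=(7, hidden))  # sentence B with 7 tokens
probs = softmax(siamese_logits(tokens_a, tokens_b))
print(probs)
```

Sharing one encoder between both inputs is what makes the network Siamese: after training, that encoder can embed single sentences on its own.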

Guide: Running Locally

To run the distilbert-base-nli-mean-tokens model locally, follow these steps:

  1. Install Sentence Transformers:

    pip install -U sentence-transformers
    
  2. Load and Use the Model:

    from sentence_transformers import SentenceTransformer
    sentences = ["This is an example sentence", "Each sentence is converted"]
    model = SentenceTransformer('sentence-transformers/distilbert-base-nli-mean-tokens')
    embeddings = model.encode(sentences)
    print(embeddings)
    
  3. Alternative Using Hugging Face Transformers (without sentence-transformers, apply the mean pooling yourself):

    from transformers import AutoTokenizer, AutoModel
    import torch
    
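    def mean_pooling(model_output, attention_mask):
        # The first element of model_output holds the per-token embeddings
        token_embeddings = model_output[0]
        # Expand the attention mask over the hidden dimension so padding tokens contribute zero
        input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
        # Sum the unmasked token embeddings and divide by the number of real tokens
        return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)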
    
    sentences = ['This is an example sentence', 'Each sentence is converted']
    tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/distilbert-base-nli-mean-tokens')
    model = AutoModel.from_pretrained('sentence-transformers/distilbert-base-nli-mean-tokens')
    
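    # Tokenize the sentences into padded, truncated PyTorch tensors
    encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
    # Compute token embeddings without tracking gradients
    with torch.no_grad():
        model_output = model(**encoded_input)
    
    # Pool the token embeddings into one vector per sentence
    sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])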
    print("Sentence embeddings:")
    print(sentence_embeddings)
    

The model runs on a CPU, but encoding large batches is considerably faster on a GPU. If no local GPU is available, cloud GPUs such as those offered by AWS, GCP, or Azure are an option.

License

The model is released under the Apache 2.0 License. Use of the model should adhere to the terms outlined in this license.
