facebook-dpr-ctx_encoder-multiset-base
Introduction
The facebook-dpr-ctx_encoder-multiset-base model from Sentence Transformers is based on Facebook's DPR and maps sentences and paragraphs into a 768-dimensional dense vector space. It can be used for tasks such as clustering and semantic search.
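As a quick illustration of the clustering use case, the sketch below groups a small made-up corpus by embedding similarity; the sentences and cluster count are assumptions for the example, and scikit-learn is assumed to be installed.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer('sentence-transformers/facebook-dpr-ctx_encoder-multiset-base')

# Hypothetical corpus, invented for this example
corpus = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
    "A woman is riding a horse.",
]

# Encode the corpus into 768-dimensional vectors and cluster them
embeddings = model.encode(corpus)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(embeddings)
for sentence, label in zip(corpus, kmeans.labels_):
    print(label, sentence)
```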
Architecture
The model architecture consists of two main components:
- Transformer: Uses a BertModel with a maximum sequence length of 509 and does not convert text to lowercase.
- Pooling: Uses CLS-token pooling with a word embedding dimension of 768.
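These settings can be checked on the loaded model with standard sentence-transformers attributes; the sketch below is a quick verification, not part of the model card itself.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('sentence-transformers/facebook-dpr-ctx_encoder-multiset-base')

# Printing the model shows the Transformer -> Pooling module stack
print(model)

print(model.max_seq_length)                      # 509
print(model.get_sentence_embedding_dimension())  # 768
```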
Training
This model is a port of Facebook Research's Dense Passage Retrieval (DPR) context encoder, adapted for use with the sentence-transformers library. It is designed to generate high-quality passage embeddings for retrieval and related NLP tasks.
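Because this is the DPR context encoder, it is typically paired with the matching question encoder for retrieval. The sketch below assumes the companion model sentence-transformers/facebook-dpr-question_encoder-multiset-base and uses made-up passages and a made-up query; it scores passages by dot product, the similarity DPR was trained with.

```python
from sentence_transformers import SentenceTransformer, util

ctx_encoder = SentenceTransformer('sentence-transformers/facebook-dpr-ctx_encoder-multiset-base')
question_encoder = SentenceTransformer('sentence-transformers/facebook-dpr-question_encoder-multiset-base')

# Hypothetical passages and query for illustration
passages = [
    "Paris is the capital and most populous city of France.",
    "The mitochondrion is the powerhouse of the cell.",
]
query = "What is the capital of France?"

# Encode passages with the context encoder, the query with the question encoder
passage_embeddings = ctx_encoder.encode(passages, convert_to_tensor=True)
query_embedding = question_encoder.encode(query, convert_to_tensor=True)

# DPR scores with dot product rather than cosine similarity
scores = util.dot_score(query_embedding, passage_embeddings)
print(scores)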
Guide: Running Locally
Basic Steps
- Install Sentence Transformers:

```bash
pip install -U sentence-transformers
```
- Using the Model:

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('sentence-transformers/facebook-dpr-ctx_encoder-multiset-base')
embeddings = model.encode(sentences)
print(embeddings)
```
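By default, encode returns a NumPy array of shape (len(sentences), 768); pass convert_to_tensor=True to get a PyTorch tensor instead.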
- Using Hugging Face Transformers:

```python
from transformers import AutoTokenizer, AutoModel
import torch

# CLS pooling: take the embedding of the first token ([CLS]) from the last hidden state
def cls_pooling(model_output, attention_mask):
    return model_output[0][:, 0]

sentences = ['This is an example sentence', 'Each sentence is converted']

tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/facebook-dpr-ctx_encoder-multiset-base')
model = AutoModel.from_pretrained('sentence-transformers/facebook-dpr-ctx_encoder-multiset-base')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform CLS pooling to get one vector per sentence
sentence_embeddings = cls_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
```
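Since the model's Pooling module uses the CLS token, the embeddings produced this way should match those from the sentence-transformers snippet above, up to floating-point differences.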
Cloud GPUs
For faster encoding, especially of large corpora, consider running the model on a cloud GPU from providers such as AWS, Google Cloud, or Azure.
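If a GPU is available, sentence-transformers can be pointed at it directly; the device string and batch size below are assumptions about your environment, not requirements.

```python
from sentence_transformers import SentenceTransformer

# 'cuda' assumes an NVIDIA GPU is available; use 'cpu' otherwise
model = SentenceTransformer('sentence-transformers/facebook-dpr-ctx_encoder-multiset-base', device='cuda')

# Larger batches generally improve GPU throughput
embeddings = model.encode(["This is an example sentence"] * 1000, batch_size=128)
```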
License
The model is distributed under the Apache 2.0 License.