contriever LLM Model — Open LLM List

Introduction

The CONTRIEVER model is designed for unsupervised dense information retrieval using contrastive learning. It is developed by Facebook AI and is available on Hugging Face's platform. The method is detailed in the paper "Towards Unsupervised Dense Information Retrieval with Contrastive Learning."

Architecture

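CONTRIEVER can be loaded through the Hugging Face Transformers library and runs on PyTorch. The model returns one embedding per token, so a mean pooling operation over the token embeddings, with padding positions masked out, is required to produce a single embedding per sentence.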

Training

The model was trained using unsupervised contrastive learning techniques. Further details on the training process can be found in the associated arXiv paper.
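
To give a rough idea of what a contrastive objective looks like, here is a minimal sketch of a generic InfoNCE loss with in-batch negatives. It is not the authors' training code: the function name `info_nce_loss`, the temperature value, and the use of in-batch negatives are all assumptions made for illustration. The paper describes the actual setup.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(queries: torch.Tensor, keys: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """Generic InfoNCE loss (illustrative, not the paper's implementation).

    queries[i] and keys[i] form a positive pair; every other key in the
    batch is treated as a negative for queries[i].
    """
    # (batch, batch) matrix of scaled similarity scores
    logits = queries @ keys.T / temperature
    # The positive for row i sits in column i
    targets = torch.arange(queries.size(0), device=queries.device)
    return F.cross_entropy(logits, targets)

torch.manual_seed(0)
q = F.normalize(torch.randn(4, 8), dim=-1)       # toy unit-length embeddings
aligned = info_nce_loss(q, q)                    # positives correctly paired
shuffled = info_nce_loss(q, q.roll(1, dims=0))   # positives deliberately mispaired
print(aligned.item(), shuffled.item())
```

The loss is lower when each query is paired with its true positive than when the pairs are shifted, which is the signal that pulls matching texts together during training.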

Guide: Running Locally

Install Dependencies: Ensure you have PyTorch and the Transformers library installed.
```
pip install torch transformers
```

Load Model and Tokenizer:

```
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('facebook/contriever')
model = AutoModel.from_pretrained('facebook/contriever')
```

Prepare Data:

```
sentences = [
    "Where was Marie Curie born?",
    "Maria Sklodowska, later known as Marie Curie, was born on November 7, 1867.",
    "Born in Paris on 15 May 1859, Pierre Curie was the son of Eugène Curie."
]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
```

Compute Embeddings:

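```
def mean_pooling(token_embeddings, mask):
    # Zero out padding positions so they do not contribute to the sum
    token_embeddings = token_embeddings.masked_fill(~mask[..., None].bool(), 0.)
    # Divide by the number of real tokens in each sentence
    sentence_embeddings = token_embeddings.sum(dim=1) / mask.sum(dim=1)[..., None]
    return sentence_embeddings

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)

# outputs[0] holds the token embeddings (last hidden state)
embeddings = mean_pooling(outputs[0], inputs['attention_mask'])
```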
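
Once embeddings are available, a query can be compared against candidate passages. The sketch below assumes dot-product similarity, which is a common choice for dense retrievers; check the model card or paper for the scoring the authors recommend. The helper `rank_documents` and the toy vectors are hypothetical stand-ins so the example runs without downloading the model.

```python
import torch

def rank_documents(query_embedding: torch.Tensor, doc_embeddings: torch.Tensor):
    """Score each document against the query (hypothetical helper).

    Returns (scores, order), where order lists document indices from
    highest to lowest score.
    """
    scores = doc_embeddings @ query_embedding          # one dot product per document
    order = torch.argsort(scores, descending=True)
    return scores, order

# Stand-in vectors. With the guide's data, embeddings[0] would be the
# query and embeddings[1:] the candidate passages.
toy_query = torch.tensor([1.0, 0.0, 1.0])
toy_docs = torch.tensor([
    [0.9, 0.1, 0.8],   # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
])
scores, order = rank_documents(toy_query, toy_docs)
print(scores.tolist(), order.tolist())
```

With the sentences from the guide, the equivalent call would be `rank_documents(embeddings[0], embeddings[1:])`.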

Cloud GPUs

For faster inference, particularly when embedding large document collections, consider running the model on a GPU from a cloud provider such as Amazon Web Services (AWS), Google Cloud Platform (GCP), or Azure.
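
On a GPU machine, the model and its inputs both need to be moved to the GPU before the forward pass. The following is a minimal sketch of device selection in PyTorch, using a toy tensor batch in place of real tokenizer output. The name `toy_batch` and the token ids in it are illustrative only.

```python
import torch

# Use a CUDA GPU when PyTorch can see one, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for tokenizer output: a dict of tensors (ids are illustrative)
toy_batch = {
    "input_ids": torch.tensor([[101, 2073, 102]]),
    "attention_mask": torch.tensor([[1, 1, 1]]),
}
# Move every tensor in the batch to the selected device
toy_batch = {name: tensor.to(device) for name, tensor in toy_batch.items()}
print(device, toy_batch["input_ids"].device)

# With the objects from the guide, the same idea would look like:
#   model = model.to(device)
#   inputs = inputs.to(device)
#   with torch.no_grad():
#       outputs = model(**inputs)
```

The same script then runs unchanged on a laptop CPU or a cloud GPU instance.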

License

Refer to the GitHub repository for licensing details related to the CONTRIEVER model.
