facebook/rag-sequence-nq

Introduction

The RAG-Sequence Model is designed for retrieval-augmented generation in knowledge-intensive NLP tasks. It combines a question encoder, a retriever, and a generator to retrieve relevant passages and generate answers to queries. The model is based on the paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" by Patrick Lewis, Ethan Perez, Aleksandra Piktus, et al.
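
For context, the paper treats the retrieved passage z as a latent variable: the RAG-Sequence formulation uses the same passage for the whole generated sequence and marginalizes over the top-k retrieved passages. In the paper's notation, p_eta is the retriever (DPR), p_theta is the generator (BART), x is the question, and y is the answer:

    p_{\text{RAG-Sequence}}(y \mid x)
      \approx \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x)\, p_\theta(y \mid x, z)
      = \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1})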

Architecture

The RAG-Sequence Model is uncased, meaning all input text is converted to lowercase. It comprises three main components:

  • Question Encoder: Based on facebook/dpr-question_encoder-single-nq-base.
  • Retriever: Extracts relevant passages from the wiki_dpr dataset.
  • Generator: Based on facebook/bart-large.

These components are jointly fine-tuned end-to-end on a question-answering dataset (Natural Questions, hence the "nq" in the model name), with wiki_dpr serving as the retrieval corpus.
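
The composed model exposes these parts as sub-modules, which can be handy for inspection. A minimal sketch, assuming the sub-module layout of the current transformers RAG implementation:

    from transformers import RagRetriever, RagSequenceForGeneration

    # The dummy index keeps the download small; see the Training section below.
    retriever = RagRetriever.from_pretrained(
        "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
    )
    model = RagSequenceForGeneration.from_pretrained(
        "facebook/rag-sequence-nq", retriever=retriever
    )

    # The three components described above:
    print(type(model.rag.question_encoder).__name__)  # DPRQuestionEncoder
    print(type(model.rag.generator).__name__)         # BartForConditionalGeneration
    print(type(model.rag.retriever).__name__)         # RagRetriever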

Training

Training involves end-to-end fine-tuning of the question encoder, retriever, and generator. The retriever's role is crucial: it selects relevant passages from a large wiki_dpr index. The example usage below employs a dummy retriever, however, because full use of the legacy index requires over 75 GB of RAM.
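
To make the memory trade-off concrete, here is a sketch of the two retriever configurations; the dummy form mirrors the example below, while the commented-out form reflects my reading of the RagRetriever defaults, so treat it as an assumption:

    from transformers import RagRetriever

    # Dummy index: a tiny slice of wiki_dpr, for smoke tests only.
    retriever = RagRetriever.from_pretrained(
        "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
    )

    # Full index (the model's default configuration uses the legacy DPR index,
    # which requires over 75 GB of RAM to load):
    # retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq")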

Guide: Running Locally

To run the RAG-Sequence model locally, follow these steps:

  1. Install Dependencies: Ensure you have the Hugging Face Transformers library installed. The retriever also needs the datasets and faiss libraries, and the examples use PyTorch.

    pip install transformers datasets faiss-cpu torch
    
  2. Load the Model and Components:

    from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration
    
    tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
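    # A dummy dataset keeps the example lightweight; the full legacy index needs over 75 GB of RAM.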
    retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True)
    model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)
    
  3. Generate Answers:

    # prepare_seq2seq_batch is deprecated; calling the tokenizer directly is equivalent here
    input_dict = tokenizer("how many countries are in europe", return_tensors="pt")
    generated = model.generate(input_ids=input_dict["input_ids"])
    print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
    
  4. Consider Cloud GPUs: Generation with RAG is compute-intensive, so for anything beyond small experiments consider running on a cloud GPU (see the sketch after this list).
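
A minimal sketch for the GPU path, assuming the `model`, `tokenizer`, and `input_dict` objects created in steps 2 and 3:

    import torch

    # Move the model to a GPU if one is available.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    # Note: the faiss passage lookup inside generate() still runs on the CPU.
    input_ids = input_dict["input_ids"].to(device)
    generated = model.generate(input_ids=input_ids)
    print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])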

License

The RAG-Sequence Model is available under the Apache 2.0 License, which allows for both personal and commercial use with proper attribution.
