facebook/genre-kilt
Introduction
GENRE (Generative ENtity REtrieval) is a system for autoregressive entity retrieval built on a fine-tuned BART sequence-to-sequence model. It performs tasks such as entity linking by generating unique entity names from input text, using constrained beam search to ensure that only valid identifiers are produced.
Architecture
The GENRE model leverages the BART architecture for sequence-to-sequence learning. It uses constrained beam search to generate entity names, so that only valid identifiers can be produced. Initially implemented with fairseq, the model was converted for use with Hugging Face's transformers library using a conversion script.
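The constraint is supplied to generate through the generic prefix_allowed_tokens_fn hook in transformers. The following is a minimal sketch of the idea using a toy two-title catalogue instead of the full KILT title trie; the helper function and the example titles are illustrative assumptions, not part of the released model (the released prefix tree is covered in the guide below).

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("facebook/genre-kilt")
    model = AutoModelForSeq2SeqLM.from_pretrained("facebook/genre-kilt").eval()

    # Toy catalogue of valid entity names (the released setup uses all KILT titles).
    allowed_titles = ["Albert Einstein", "Germany"]
    allowed_ids = [tokenizer(t, add_special_tokens=False).input_ids for t in allowed_titles]

    def prefix_allowed_tokens_fn(batch_id, generated):
        # Keep only next tokens that leave the output a prefix of some valid title.
        prefix = [t for t in generated.tolist() if t not in tokenizer.all_special_ids]
        nxt = {ids[len(prefix)] for ids in allowed_ids
               if len(prefix) < len(ids) and ids[:len(prefix)] == prefix}
        if not prefix:
            nxt.add(tokenizer.bos_token_id)  # let the decoder emit its start tokens
        return sorted(nxt) or [tokenizer.eos_token_id]  # title finished: allow only EOS

    outputs = model.generate(
        **tokenizer(["Einstein was a German physicist."], return_tensors="pt"),
        num_beams=2,
        prefix_allowed_tokens_fn=prefix_allowed_tokens_fn,
    )
    print(tokenizer.batch_decode(outputs, skip_special_tokens=True))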
Training
The model was trained on the full KILT benchmark, which comprises 11 datasets covering fact checking, entity linking, slot filling, dialogue, and open-domain extractive and abstractive question answering.
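For reference, the KILT tasks can be inspected with the Hugging Face datasets library. The sketch below is an assumption based on the Hub's kilt_tasks dataset (the "fever" configuration and the field names are not part of this card); adjust it to whichever of the 11 tasks you want to look at.

    # Sketch: peek at one KILT task via the datasets library (assumed dataset name/config).
    from datasets import load_dataset

    fever = load_dataset("facebook/kilt_tasks", "fever", split="train")
    print(fever[0]["input"])   # the claim / query text
    print(fever[0]["output"])  # gold answers and provenance in KILT format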
Guide: Running Locally
To use the GENRE model locally, follow these steps:
- Install the Transformers Library: Ensure you have the transformers library installed.

      pip install transformers
- Download the Model: Load the tokenizer and model.

      from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

      tokenizer = AutoTokenizer.from_pretrained("facebook/genre-kilt")
      model = AutoModelForSeq2SeqLM.from_pretrained("facebook/genre-kilt").eval()
- Generate Predictions: Input sentences to generate entity names.

      sentences = ["Einstein was a German physicist."]

      outputs = model.generate(
          **tokenizer(sentences, return_tensors="pt"),
          num_beams=5,
          num_return_sequences=5,
      )

      tokenizer.batch_decode(outputs, skip_special_tokens=True)
  Optionally, use the prefix tree for constrained beam search by downloading the additional resources (trie.py and kilt_titles_trie_dict.pkl) and incorporating them into the generation call; see the sketch after this list.
- Cloud GPUs: For enhanced performance, consider using cloud-based GPUs like those offered by AWS, Google Cloud, or Azure.
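A minimal sketch of the optional constrained setup, assuming trie.py and kilt_titles_trie_dict.pkl have been downloaded into the working directory and that the tokenizer and model from the steps above are already loaded. The Trie.load_from_dict and trie.get calls follow the trie.py shipped with the GENRE resources; treat them as assumptions if your copy differs.

    import pickle
    from trie import Trie  # trie.py from the GENRE resources

    # Load the prefix tree built over the KILT entity titles.
    with open("kilt_titles_trie_dict.pkl", "rb") as f:
        trie = Trie.load_from_dict(pickle.load(f))

    sentences = ["Einstein was a German physicist."]

    # Constrain beam search so every generated sequence is a valid KILT title.
    outputs = model.generate(
        **tokenizer(sentences, return_tensors="pt"),
        num_beams=5,
        num_return_sequences=5,
        prefix_allowed_tokens_fn=lambda batch_id, sent: trie.get(sent.tolist()),
    )

    tokenizer.batch_decode(outputs, skip_special_tokens=True)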
License
The GENRE model and code are available under the terms specified in the original repository. Users are encouraged to cite the relevant works when using the model in their projects.