bert2bert base arxiv titlegen
Callidior
Introduction
The BERT2BERT-BASE-ARXIV-TITLEGEN model is designed to generate titles for computer science papers based on their abstracts. It leverages a BERT2BERT Encoder-Decoder configuration, initialized with the official bert-base-uncased checkpoint, and has been fine-tuned on a large dataset of arXiv.org papers.
Architecture
The model employs a BERT2BERT Encoder-Decoder architecture in which both the encoder and the decoder are initialized from the bert-base-uncased checkpoint. The decoder attends to the encoder's output through cross-attention layers, so each generated token is conditioned on the full abstract. This makes the architecture suitable for abstractive summarization tasks such as generating a title from an abstract.
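As an illustration of this warm-start setup, the following is a minimal sketch using the `EncoderDecoderModel` class from the transformers library. The helper name `build_bert2bert` is hypothetical, and the special-token choices ([CLS] as decoder start, [SEP] as end of sequence) follow the common BERT2BERT recipe; they are assumptions, not a confirmed record of how this particular model was configured.

```python
CHECKPOINT = "bert-base-uncased"  # checkpoint named in the model description


def build_bert2bert(checkpoint: str = CHECKPOINT):
    """Warm-start an encoder-decoder model with BERT weights on both sides.

    Hypothetical helper for illustration only. Calling it downloads the
    checkpoint from the Hugging Face Hub.
    """
    # Imported lazily so that merely defining the helper needs no heavy dependencies.
    from transformers import BertTokenizerFast, EncoderDecoderModel

    tokenizer = BertTokenizerFast.from_pretrained(checkpoint)

    # The decoder copy is configured as a causal decoder with cross-attention;
    # the cross-attention weights are newly initialized and learned in fine-tuning.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)

    # BERT has no dedicated BOS/EOS tokens, so [CLS] and [SEP] are reused
    # (an assumption based on the usual BERT2BERT setup).
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    model.config.pad_token_id = tokenizer.pad_token_id
    return model, tokenizer
```

A model built this way is not yet useful for title generation; it still has to be fine-tuned on abstract/title pairs.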
Training
The model was fine-tuned on a dataset of 318,500 computer science papers from arXiv.org, spanning from 2007 to 2022. It achieved a ROUGE-2 F1 score of 26.3% on held-out validation data. ROUGE-2 measures the bigram overlap between generated and reference titles.
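To make the metric concrete, here is a simplified sketch of a ROUGE-2 F1 computation for a single title pair. It assumes lowercased whitespace tokenization and no stemming, so its numbers will differ from those of a full ROUGE implementation, and it is not the evaluation script used to obtain the reported score. The function names are illustrative.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Count adjacent word pairs after lowercasing and whitespace splitting."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f1(candidate: str, reference: str) -> float:
    """Harmonic mean of bigram precision and recall (simplified ROUGE-2)."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Clipped overlap: each bigram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example pair: 3 shared bigrams, 4 in the candidate, 5 in the reference.
score = rouge2_f1(
    "deep learning for title generation",
    "deep learning for paper title generation",
)
```

In this example precision is 3/4 and recall is 3/5, giving an F1 of 2/3. A corpus-level score such as the one reported above is typically an average over all validation pairs.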
Guide: Running Locally
To run this model locally, follow these steps:
- Setup Environment: Install the required Python libraries, including transformers and torch.
- Download Model: Access the BERT2BERT-BASE-ARXIV-TITLEGEN model from Hugging Face's model hub.
- Load Model: Use the transformers library to load the model and tokenizer.
- Input and Generate: Provide the abstract of a paper as input and generate the title using the model's text generation capabilities.
For improved performance, consider using cloud-based GPUs such as those provided by Google Cloud, AWS, or Azure to accelerate the model inference process.
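The steps above can be sketched roughly as follows. The Hub identifier `Callidior/bert2bert-base-arxiv-titlegen` is an assumption inferred from the model and author names, and the helper names, beam count, and length limits are illustrative choices rather than documented settings. The code uses a GPU when one is available and falls back to the CPU otherwise.

```python
import re

# Assumed Hub identifier; verify it on the model hub before use.
MODEL_ID = "Callidior/bert2bert-base-arxiv-titlegen"


def normalize_abstract(text: str) -> str:
    """Collapse line breaks and repeated whitespace into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def generate_title(abstract: str, model_id: str = MODEL_ID,
                   max_new_tokens: int = 32) -> str:
    """Load the model and tokenizer, then generate one title for an abstract."""
    # Imported lazily so normalize_abstract works without torch/transformers.
    import torch
    from transformers import AutoTokenizer, EncoderDecoderModel

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = EncoderDecoderModel.from_pretrained(model_id).to(device)
    model.eval()

    # BERT accepts at most 512 input tokens, so longer abstracts are truncated.
    inputs = tokenizer(
        normalize_abstract(abstract),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    ).to(device)

    with torch.no_grad():
        output_ids = model.generate(
            inputs.input_ids,
            attention_mask=inputs.attention_mask,
            num_beams=4,  # illustrative setting, not taken from the model card
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_title(abstract_text)` downloads the weights on first use and returns the generated title as a string. When generating titles for many abstracts, load the model once and reuse it instead of reloading it on every call.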
License
The BERT2BERT-BASE-ARXIV-TITLEGEN model is licensed under the Apache 2.0 License, allowing for broad usage and modification under its terms.