DictBERT
Introduction
DictBERT is a pre-trained language model from the ACL 2022 paper "Dict-BERT: Enhancing Language Model Pre-training with Dictionary." It improves upon the BERT model by incorporating definitions of rare words from English dictionaries, such as Wiktionary. The model's architecture is based on BERT and is trained under similar conditions.
Architecture
DictBERT builds on the BERT architecture, a bidirectional Transformer encoder. It enhances standard BERT pre-training by appending dictionary definitions of rare words to the input, so the model learns better representations for words that appear infrequently in the pre-training corpus.
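To make the idea concrete, the following is a minimal sketch of how a rare word's definition could be exposed to a BERT-style encoder. It is illustrative only: the gloss text is made up, and the exact input format and training objectives used by DictBERT are defined in the paper and repository.

# Illustrative sketch, not the official DictBERT input pipeline:
# append a Wiktionary-style gloss for a rare word so the encoder can attend to it.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

sentence = "The sommelier recommended a dry Riesling."
# Hypothetical lookup result; DictBERT draws such glosses from Wiktionary.
rare_word = "sommelier"
gloss = "a person who specializes in wine, especially in a restaurant"

# One simple way to expose the definition: encode it as a second segment.
encoded = tokenizer(sentence, f"{rare_word}: {gloss}", return_tensors="pt")
print(tokenizer.decode(encoded["input_ids"][0]))
# [CLS] the sommelier recommended ... [SEP] sommelier : a person who ... [SEP]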
Training
DictBERT is trained with the same settings as BERT, with the added use of dictionary definitions to enrich the model's understanding of rare words. In evaluations on the GLUE benchmark, DictBERT consistently outperformed the standard BERT baseline, achieving higher scores on metrics such as accuracy and Pearson correlation across tasks including MNLI, QNLI, QQP, SST-2, CoLA, MRPC, RTE, and STS-B.
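For reference, the per-task GLUE metrics mentioned above can be computed with the evaluate library; accuracy is used for classification tasks such as MNLI, while STS-B is scored with Pearson and Spearman correlation. The predictions below are dummy values, not DictBERT outputs.

# Dummy example of GLUE metric computation (requires: pip install evaluate scipy scikit-learn)
import evaluate

mnli_metric = evaluate.load("glue", "mnli")  # reports accuracy
stsb_metric = evaluate.load("glue", "stsb")  # reports Pearson and Spearman correlation

print(mnli_metric.compute(predictions=[0, 1, 2, 1], references=[0, 1, 1, 1]))
print(stsb_metric.compute(predictions=[0.2, 0.8, 3.1], references=[0.0, 1.0, 3.0]))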
Guide: Running Locally
- Prerequisites: Ensure you have Python and PyTorch installed on your system.
- Clone Repository: Download the code from the GitHub repository:
git clone https://github.com/wyu97/DictBERT
- Install Dependencies: Navigate to the project directory and install the dependencies:
pip install -r requirements.txt
- Download Model: Use the Hugging Face Transformers library to download the DictBERT model (see the loading sketch after this list).
- Fine-tuning: Follow the instructions in the repository to fine-tune DictBERT on your dataset.
- Cloud GPUs: For better performance and faster training, consider using cloud services like AWS, Google Cloud, or Azure, which offer GPU instances.
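The snippet below sketches the download and loading step. The checkpoint identifier wyu1/DictBERT is an assumption based on the author's Hugging Face account; replace it with the actual name of the hosted checkpoint if it differs.

# Sketch: load DictBERT with Hugging Face Transformers.
# "wyu1/DictBERT" is an assumed model id; adjust it to the real checkpoint name.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "wyu1/DictBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# num_labels=2 is just an example for a binary GLUE-style task such as SST-2.
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

inputs = tokenizer("DictBERT handles rare words better than vanilla BERT.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2])

From here, fine-tuning follows the usual Transformers workflow (for example with Trainer or a manual PyTorch loop), while the repository's own scripts cover the dictionary-specific preprocessing described in the paper.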
License
DictBERT is licensed under the Creative Commons Attribution 4.0 International License (cc-by-4.0). This allows for sharing and adaptation with appropriate credit.