Clinical Longformer
yikuan8/Clinical-Longformer
Introduction
Clinical-Longformer is a specialized version of the Longformer model, enriched with clinical knowledge through pre-training on MIMIC-III clinical notes. It supports input sequences of up to 4,096 tokens and outperforms ClinicalBERT by at least 2% on tasks such as named entity recognition, question answering, natural language inference, and text classification.
Architecture
Clinical-Longformer was initialized from the base version of Longformer and inherits its sparse attention mechanism, which makes long input sequences tractable. Clinical knowledge was added through additional pre-training on clinical text.
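To see the long-context settings concretely, the short sketch below loads the checkpoint's configuration and prints the standard Longformer fields for the positional limit and the local attention window; it simply inspects whatever the hosted config declares.

from transformers import AutoConfig

# Load the configuration shipped with the checkpoint.
config = AutoConfig.from_pretrained("yikuan8/Clinical-Longformer")

# Standard Longformer config fields: maximum positions and per-layer local attention window.
print("max_position_embeddings:", config.max_position_embeddings)
print("attention_window:", config.attention_window)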
Training
The model was pre-trained on six 32GB Tesla V100 GPUs with FP16 mixed precision for efficiency. Training ran for 200,000 steps with a batch size of 6×3 and a learning rate of 3e-5, taking more than two weeks.
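For orientation only, here is a minimal sketch of a comparable continued pre-training setup using the Hugging Face Trainer. It is not the authors' script: it assumes the "6×3" batch size means 6 sequences per device, uses the common 15% masking rate, and substitutes a toy in-memory dataset for MIMIC-III, which requires credentialed access and is not distributed with the model.

from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from the general-domain Longformer base checkpoint.
tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
model = AutoModelForMaskedLM.from_pretrained("allenai/longformer-base-4096")

# Toy stand-in for tokenized MIMIC-III notes (hypothetical example sentences).
toy_texts = [
    "Patient admitted with shortness of breath and started on diuretics.",
    "Discharged home in stable condition with outpatient follow-up.",
]
train_dataset = [tokenizer(t, truncation=True, max_length=4096) for t in toy_texts]

# Masked-language-modeling collator; 15% masking is the usual default, not stated in this card.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="clinical-longformer-pretrain",
    max_steps=200_000,               # stated above
    learning_rate=3e-5,              # stated above
    per_device_train_batch_size=6,   # reading "6x3" as 6 per GPU (assumption)
    fp16=True,                       # stated above; requires a GPU
    logging_steps=500,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=collator,
)
trainer.train()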
Guide: Running Locally
To use Clinical-Longformer locally, follow these steps:
- Install the Hugging Face Transformers library:
  pip install transformers
- Load the model and tokenizer (a usage sketch follows these steps):
  from transformers import AutoTokenizer, AutoModelForMaskedLM
  tokenizer = AutoTokenizer.from_pretrained("yikuan8/Clinical-Longformer")
  model = AutoModelForMaskedLM.from_pretrained("yikuan8/Clinical-Longformer")
- For enhanced performance, especially with large inputs, consider using cloud GPUs such as NVIDIA Tesla V100 instances on platforms like AWS, Google Cloud, or Azure.
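As a quick sanity check once the model is loaded, the sketch below runs the fill-mask pipeline on an illustrative clinical sentence; the sentence and any predicted tokens are purely hypothetical examples, not outputs documented for this model.

from transformers import pipeline

# Build a masked-language-model pipeline on top of Clinical-Longformer.
fill_mask = pipeline(
    "fill-mask",
    model="yikuan8/Clinical-Longformer",
    tokenizer="yikuan8/Clinical-Longformer",
)

# Illustrative clinical sentence; <mask> is the RoBERTa-style mask token Longformer uses.
text = "The patient was admitted with acute <mask> failure and started on diuretics."

for prediction in fill_mask(text, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))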
License
Please refer to the model's Hugging Face page for specific licensing details. Use in accordance with the terms provided there.