Clinical Longformer
yikuan8/Clinical-Longformer
Introduction
Clinical-Longformer is a specialized version of the Longformer model, enriched with clinical knowledge through pre-training on MIMIC-III clinical notes. It supports input sequences of up to 4,096 tokens and outperforms ClinicalBERT by at least 2% on tasks such as named entity recognition, question answering, natural language inference, and text classification.
Architecture
Clinical-Longformer was initialized from the base version of Longformer and inherits its sparse attention mechanism, which makes long input sequences tractable. Clinical knowledge was added through additional pre-training on clinical text.
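To see the long-context settings concretely, the short sketch below loads the checkpoint's configuration and prints the standard Longformer fields for the positional limit and the local attention window; it simply inspects whatever the hosted config declares.

from transformers import AutoConfig

# Load the configuration shipped with the checkpoint.
config = AutoConfig.from_pretrained("yikuan8/Clinical-Longformer")

# Standard Longformer config fields: maximum positions and per-layer local attention window.
print("max_position_embeddings:", config.max_position_embeddings)
print("attention_window:", config.attention_window)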
Training
The model was pre-trained on six 32GB Tesla V100 GPUs with FP16 mixed precision for efficiency. Training ran for 200,000 steps with a batch size of 6×3 and a learning rate of 3e-5, taking more than two weeks.
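For orientation only, here is a minimal sketch of a comparable continued pre-training setup using the Hugging Face Trainer. It is not the authors' script: it assumes the "6×3" batch size means 6 sequences per device, uses the common 15% masking rate, and substitutes a toy in-memory dataset for MIMIC-III, which requires credentialed access and is not distributed with the model.

from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from the general-domain Longformer base checkpoint.
tokenizer = AutoTokenizer.from_pretrained("allenai/longformer-base-4096")
model = AutoModelForMaskedLM.from_pretrained("allenai/longformer-base-4096")

# Toy stand-in for tokenized MIMIC-III notes (hypothetical example sentences).
toy_texts = [
    "Patient admitted with shortness of breath and started on diuretics.",
    "Discharged home in stable condition with outpatient follow-up.",
]
train_dataset = [tokenizer(t, truncation=True, max_length=4096) for t in toy_texts]

# Masked-language-modeling collator; 15% masking is the usual default, not stated in this card.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="clinical-longformer-pretrain",
    max_steps=200_000,               # stated above
    learning_rate=3e-5,              # stated above
    per_device_train_batch_size=6,   # reading "6x3" as 6 per GPU (assumption)
    fp16=True,                       # stated above; requires a GPU
    logging_steps=500,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=collator,
)
trainer.train()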
Guide: Running Locally
To use Clinical-Longformer locally, follow these steps:
- Install the Hugging Face Transformers library:
  pip install transformers
- Load the model and tokenizer (a usage sketch follows these steps):
  from transformers import AutoTokenizer, AutoModelForMaskedLM
  tokenizer = AutoTokenizer.from_pretrained("yikuan8/Clinical-Longformer")
  model = AutoModelForMaskedLM.from_pretrained("yikuan8/Clinical-Longformer")
- For enhanced performance, especially with large inputs, consider using cloud GPUs such as NVIDIA Tesla V100 instances on platforms like AWS, Google Cloud, or Azure.
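As a quick sanity check once the model is loaded, the sketch below runs the fill-mask pipeline on an illustrative clinical sentence; the sentence and any predicted tokens are purely hypothetical examples, not outputs documented for this model.

from transformers import pipeline

# Build a masked-language-model pipeline on top of Clinical-Longformer.
fill_mask = pipeline(
    "fill-mask",
    model="yikuan8/Clinical-Longformer",
    tokenizer="yikuan8/Clinical-Longformer",
)

# Illustrative clinical sentence; <mask> is the RoBERTa-style mask token Longformer uses.
text = "The patient was admitted with acute <mask> failure and started on diuretics."

for prediction in fill_mask(text, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))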
License
Please refer to the model's Hugging Face page for specific licensing details. Use in accordance with the terms provided there.