clinical assertion negation bert
bvanaken
Introduction
The Clinical Assertion and Negation Classification BERT model is designed to structure information in clinical patient notes by classifying medical conditions into three categories: PRESENT, ABSENT, and POSSIBLE. This model builds upon the ClinicalBERT - Bio + Discharge Summary BERT Model and is fine-tuned using assertion data from the 2010 i2b2 challenge.
Architecture
The model uses the BERT architecture, specifically the ClinicalBERT variant, which is pre-trained on a combination of biomedical literature and discharge summary data. It has been further fine-tuned to detect assertions in clinical texts. Each input is a sentence in which exactly one entity is marked for classification by placing the special [entity] token immediately before and after it.
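The input format described above can be produced programmatically. The following is a minimal sketch, assuming the entity's character offsets in the sentence are already known (for example from an upstream entity recognizer). The helper name `mark_entity` is hypothetical and is not part of the model or of the transformers library.

```python
def mark_entity(sentence: str, start: int, end: int, marker: str = "[entity]") -> str:
    """Surround sentence[start:end] with the marker token on both sides.

    Hypothetical helper: the marker is separated from the entity text by a
    single space, and the text before and after the entity is kept unchanged.
    """
    if not 0 <= start < end <= len(sentence):
        raise ValueError("start and end must delimit a non-empty span inside the sentence")
    return (
        sentence[:start]
        + marker + " "
        + sentence[start:end]
        + " " + marker
        + sentence[end:]
    )


sentence = "The patient recovered during the night and now denies any shortness of breath."
entity = "shortness of breath"
start = sentence.index(entity)  # character offset where the entity begins
marked = mark_entity(sentence, start, start + len(entity))
print(marked)
```

Because the model expects one marked entity per input, a sentence that mentions several conditions would be passed through such a helper once per condition, yielding one input string for each.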
Training
The model was fine-tuned on assertion data from the 2010 i2b2 challenge. This dataset provides examples of clinical assertions, which the model learns to classify as PRESENT, ABSENT, or POSSIBLE.
Guide: Running Locally
To run the Clinical Assertion and Negation Classification BERT model locally:
- Install the transformers library, together with PyTorch (needed to load the model), if you haven't already:
    pip install transformers torch
- Load the model and tokenizer:
    from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline

    tokenizer = AutoTokenizer.from_pretrained("bvanaken/clinical-assertion-negation-bert")
    model = AutoModelForSequenceClassification.from_pretrained("bvanaken/clinical-assertion-negation-bert")
- Create a text classification pipeline:
    classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)
- Classify a sample input:
    # The entity to classify is surrounded by [entity] tokens.
    text = "The patient recovered during the night and now denies any [entity] shortness of breath [entity]."
    classification = classifier(text)
    print(classification)
    # Output: [{'label': 'ABSENT', 'score': 0.9842607378959656}]
The model also runs on CPU. For faster inference on large volumes of notes, a GPU is recommended; cloud providers such as AWS, Google Cloud, and Azure offer GPU instances.
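A GPU can be used when one is present while still falling back to CPU otherwise. The sketch below is one possible way to do this, not a prescribed setup. The helper names `pick_device` and `build_classifier` are hypothetical; the `device` argument follows the transformers pipeline convention of -1 for CPU and a non-negative integer for a CUDA device index.

```python
def pick_device() -> int:
    """Return 0 (first CUDA GPU) if PyTorch reports one, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU to query; report CPU.
        return -1
    return 0 if torch.cuda.is_available() else -1


def build_classifier(model_name: str = "bvanaken/clinical-assertion-negation-bert"):
    """Hypothetical helper: load the model and place the pipeline on the chosen device."""
    # Imported here so that pick_device() stays usable without transformers installed.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        TextClassificationPipeline,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    return TextClassificationPipeline(model=model, tokenizer=tokenizer, device=pick_device())


device = pick_device()
print("Using GPU" if device >= 0 else "Using CPU")

# Usage (downloads the model on first call):
# classifier = build_classifier()
# classifier("The patient denies any [entity] chest pain [entity].")
```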
License
The model is hosted on Hugging Face and is available under the terms specified on its model card page. Always refer to the model's page for the most current license information.