clinical assertion negation bert

bvanaken

Introduction

The Clinical Assertion and Negation Classification BERT model is designed to structure information in clinical patient notes by classifying medical conditions into three categories: PRESENT, ABSENT, and POSSIBLE. This model builds upon the ClinicalBERT - Bio + Discharge Summary BERT Model and is fine-tuned using assertion data from the 2010 i2b2 challenge.

Architecture

The model uses the BERT architecture, specifically the ClinicalBERT variant, which is pre-trained on a combination of biomedical literature and discharge summary data. It has been further fine-tuned to detect assertions in clinical texts. Each input is a sentence in which the single entity to classify is enclosed between two special [entity] tokens, for example "... denies any [entity] shortness of breath [entity]."
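
The input format described above can be produced with a small helper. The following is a minimal sketch, not part of the model's own tooling: the function name mark_entity and the character-offset interface are assumptions made for illustration. It relies only on the convention stated above, namely one entity per sentence wrapped in [entity] tokens.

```python
ENTITY_MARKER = "[entity]"  # special token named in the model description


def mark_entity(sentence: str, start: int, end: int) -> str:
    """Wrap the span sentence[start:end] in [entity] markers.

    Hypothetical helper: the offsets are plain character positions,
    e.g. as produced by an upstream entity recognizer.
    """
    if not 0 <= start < end <= len(sentence):
        raise ValueError("entity span must lie inside the sentence")
    if ENTITY_MARKER in sentence:
        # The model expects exactly one marked entity per input.
        raise ValueError("sentence already contains an entity marker")
    return (
        f"{sentence[:start]}{ENTITY_MARKER} "
        f"{sentence[start:end]} {ENTITY_MARKER}{sentence[end:]}"
    )


sentence = (
    "The patient recovered during the night and now denies any "
    "shortness of breath."
)
entity = "shortness of breath"
start = sentence.index(entity)
marked = mark_entity(sentence, start, start + len(entity))
print(marked)
```

A sentence that mentions several conditions would be passed through such a helper once per condition, giving one marked input per entity.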

Training

The model was fine-tuned on assertion data from the 2010 i2b2 challenge. This dataset provides examples of clinical assertions, which the model learns to classify as PRESENT, ABSENT, or POSSIBLE.

Guide: Running Locally

To run the Clinical Assertion and Negation Classification BERT model locally:

  1. Install the transformers library, together with PyTorch as its backend, if you haven't already:

    pip install transformers torch
    
  2. Load the model and tokenizer:

    from transformers import AutoTokenizer, AutoModelForSequenceClassification, TextClassificationPipeline
    
    tokenizer = AutoTokenizer.from_pretrained("bvanaken/clinical-assertion-negation-bert")
    model = AutoModelForSequenceClassification.from_pretrained("bvanaken/clinical-assertion-negation-bert")
    
  3. Create a text classification pipeline:

    classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)
    
  4. Classify a sample input:

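    # The entity to classify is wrapped in [entity] markers.
    # Named `text` rather than `input` to avoid shadowing the Python built-in.
    text = "The patient recovered during the night and now denies any [entity] shortness of breath [entity]."
    classification = classifier(text)
    print(classification)
    # Example output: [{'label': 'ABSENT', 'score': 0.9842607378959656}]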
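
The pipeline returns a list of dictionaries with a label and a score. As a rough sketch of how that result might be consumed downstream, the snippet below turns it into a small record and flags low-confidence predictions. The function name summarize_prediction and the 0.9 threshold are assumptions chosen for illustration, not values recommended by the model authors. The sample data is the output shown in step 4, so the snippet runs without downloading the model.

```python
from typing import Dict, List, Union

# The three assertion categories the model distinguishes.
VALID_LABELS = {"PRESENT", "ABSENT", "POSSIBLE"}

Prediction = Dict[str, Union[str, float]]


def summarize_prediction(
    predictions: List[Prediction], min_score: float = 0.9
) -> Dict[str, Union[str, float, bool]]:
    """Pick the top-scoring label and flag it when the score is low.

    Hypothetical helper; `min_score` is an illustrative threshold.
    """
    if not predictions:
        raise ValueError("no predictions to summarize")
    top = max(predictions, key=lambda p: p["score"])
    if top["label"] not in VALID_LABELS:
        raise ValueError(f"unexpected label: {top['label']!r}")
    return {
        "label": top["label"],
        "score": top["score"],
        "needs_review": top["score"] < min_score,
    }


# Output copied from the classification step above.
sample_output = [{"label": "ABSENT", "score": 0.9842607378959656}]
result = summarize_prediction(sample_output)
print(result)
```

In a real workflow, the threshold would be tuned on held-out annotated notes rather than fixed in advance.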
    

The model runs on a CPU for single sentences. For classifying large volumes of notes, a GPU speeds up inference considerably; cloud GPUs from providers such as AWS, Google Cloud, or Azure are one option.

License

The model is hosted on Hugging Face and is available under the terms specified on its model card page. Always refer to the model's page for the most current license information.
