I C D 10 Code Prediction

AkshatSurolia

Introduction

The ICD-10 Code Prediction model leverages Clinical BERT embeddings to predict ICD-10 codes from clinical text. This model is designed for text classification tasks within the medical domain, particularly for processing clinical notes and discharge summaries.

Architecture

The model is based on the BERT architecture, specifically using Clinical BERT, which is initialized with either BERT-Base or BioBERT, and trained on MIMIC clinical notes or discharge summaries. This configuration makes it suitable for handling medical terminology and context effectively.

Training

The model utilizes publicly available Clinical BERT embeddings, trained on extensive medical datasets, such as MIMIC notes. The training process involves fine-tuning the model to recognize and classify medical diagnoses into ICD-10 codes.

Guide: Running Locally

To use the ICD-10 Code Prediction model locally, follow these steps:

  1. Install Dependencies: Ensure you have PyTorch and the Transformers library installed.

    pip install torch transformers
    
  2. Load the Model: Use the Transformers library to load the model and tokenizer.

    from transformers import AutoTokenizer, BertForSequenceClassification
    tokenizer = AutoTokenizer.from_pretrained("AkshatSurolia/ICD-10-Code-Prediction")
    model = BertForSequenceClassification.from_pretrained("AkshatSurolia/ICD-10-Code-Prediction")
    
  3. Run Inference: Prepare your clinical text and obtain predictions.

    text = "subarachnoid hemorrhage scalp laceration service: surgery major surgical or invasive"
    encoded_input = tokenizer(text, return_tensors='pt')
    output = model(**encoded_input)
    
  4. Get Predictions: Extract and return the top-5 predicted ICD-10 codes.

    results = output.logits.detach().cpu().numpy()[0].argsort()[::-1][:5]
    predictions = [model.config.id2label[ids] for ids in results]
    

Cloud GPUs: For enhanced performance, consider using cloud-based GPU services from providers like AWS, Google Cloud, or Azure.

License

The ICD-10 Code Prediction model is licensed under the Apache 2.0 License, which allows for both personal and commercial use.

More Related APIs in Text Classification