Medical NER
Introduction
DEBERTA-MED-NER-2 is a fine-tuned DeBERTa model tailored to recognize 41 types of medical entities. It was trained on the PubMed dataset to improve its performance on medical named entity recognition (NER) tasks.
Architecture
The model is built on the microsoft/deberta-v3-base architecture and uses a transformer-based token-classification head, which allows it to identify and classify medical entities within text.
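As a quick sanity check, the label inventory can be inspected from the model's configuration. A minimal sketch (note that under a BIO tagging scheme the number of classification labels is larger than the number of entity types):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Clinical-AI-Apollo/Medical-NER")
print(config.num_labels)                    # total classification labels (BIO tags plus O)
print(list(config.id2label.values())[:10])  # first few label names
```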
Training
During the training process, the following hyperparameters were employed:
- Learning Rate: 2e-05
- Train Batch Size: 8
- Eval Batch Size: 16
- Seed: 42
- Gradient Accumulation Steps: 2
- Total Train Batch Size: 16
- Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- LR Scheduler Type: Cosine
- LR Scheduler Warmup Ratio: 0.1
- Num Epochs: 30
- Mixed Precision Training: Native AMP
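For reference, here is a minimal sketch of how these settings could be expressed with the Transformers TrainingArguments API. This is a reconstruction from the reported values, not the original training script; the output_dir is a placeholder.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deberta-med-ner-2",  # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size: 8 * 2 = 16
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=30,
    fp16=True,                       # native AMP mixed precision
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```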
Guide: Running Locally
To run the model locally, you can use either the Hugging Face Inference API or the Transformers library pipeline:
- Using the Pipeline:

```python
from transformers import pipeline

pipe = pipeline(
    "token-classification",
    model="Clinical-AI-Apollo/Medical-NER",
    aggregation_strategy="simple",
)
result = pipe("45 year old woman diagnosed with CAD")
```
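With aggregation_strategy="simple", result is a list of dictionaries, one per aggregated entity span, each containing entity_group, score, word, start, and end fields (the standard output of the Transformers token-classification pipeline).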
- Loading the Model Directly:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("Clinical-AI-Apollo/Medical-NER")
model = AutoModelForTokenClassification.from_pretrained("Clinical-AI-Apollo/Medical-NER")
```
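When the model is loaded directly, inference follows the standard token-classification pattern. A minimal sketch, using the same illustrative sentence as above:

```python
import torch

text = "45 year old woman diagnosed with CAD"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring label for each token and map it to its name
predicted_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids):
    print(token, model.config.id2label[label_id.item()])
```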
For optimal performance, it is recommended to use cloud GPUs such as those offered by AWS, Google Cloud, or Azure.
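If a local GPU is available, the same pipeline can run on it by passing a device index; a minimal sketch:

```python
from transformers import pipeline

# device=0 selects the first CUDA GPU; omit it (or use device=-1) for CPU
pipe = pipeline(
    "token-classification",
    model="Clinical-AI-Apollo/Medical-NER",
    aggregation_strategy="simple",
    device=0,
)
```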
License
The DEBERTA-MED-NER-2 model is available under the MIT License, allowing for broad use and modification.