has-abi/extended_distilBERT-finetuned-resumes-sections
Introduction
The extended_distilBERT-finetuned-resumes-sections model is a fine-tuned version of Geotrend/distilbert-base-en-fr-cased, specialized for text classification, specifically classifying resume sections. It achieves strong results on evaluation metrics such as F1 score, ROC AUC, and accuracy.
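As a quick illustration, the model can be used through the Transformers pipeline API. A minimal sketch follows; the repository id below is inferred from the model name and should be verified against the actual Hugging Face Hub page, and the sample input is invented.

```python
# Minimal sketch: quick inference via the pipeline API.
# The repo id is inferred from the model name; verify it on the Hub.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="has-abi/extended_distilBERT-finetuned-resumes-sections",
)

# Example input (invented); returns the top-scoring section label.
print(classifier("Skills: Python, SQL, project management, stakeholder communication."))
```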
Architecture
The model is based on the DistilBERT architecture, a smaller, faster, and lighter version of BERT. DistilBERT retains roughly 97% of BERT's language-understanding performance while running 60% faster with 40% fewer parameters.
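To see where the size reduction comes from, one can inspect the base checkpoint's configuration: DistilBERT uses 6 transformer layers instead of BERT-base's 12. The field names below follow the standard Transformers DistilBertConfig.

```python
# Sketch: inspect the distilled architecture of the base checkpoint.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Geotrend/distilbert-base-en-fr-cased")
print(config.model_type)  # "distilbert"
print(config.n_layers)    # 6 transformer layers (BERT-base has 12)
print(config.dim)         # 768 hidden size, same as BERT-base
```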
Training
Training Hyperparameters
- Learning Rate: 2e-05
- Train Batch Size: 8
- Eval Batch Size: 8
- Seed: 42
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- LR Scheduler Type: Linear
- Num Epochs: 20
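A minimal sketch of these hyperparameters expressed as transformers.TrainingArguments is shown below; the output directory is a placeholder, and the Adam betas/epsilon and linear schedule match the Trainer defaults, spelled out explicitly here for clarity.

```python
# Sketch: the reported hyperparameters as transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="extended_distilBERT-finetuned-resumes-sections",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```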
Training Results
The model was trained over 20 epochs, achieving the following results:
- Final Training Loss: 0.0321
- Validation Loss: 0.0334
- F1 Score: 0.9735
- ROC AUC: 0.9850
- Accuracy: 0.9715
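The combination of metrics suggests a multi-label setup, where micro-averaged F1 and ROC AUC are typically reported alongside exact-match accuracy. The sketch below shows one way such metrics could be computed with scikit-learn; the multi-label assumption and the 0.5 threshold are mine, not stated in the card.

```python
# Sketch: F1, ROC AUC, and accuracy for an assumed multi-label classifier.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score


def compute_metrics(logits: np.ndarray, labels: np.ndarray) -> dict:
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid over per-label logits
    preds = (probs >= 0.5).astype(int)     # threshold each label independently
    return {
        "f1": f1_score(labels, preds, average="micro"),
        "roc_auc": roc_auc_score(labels, probs, average="micro"),
        "accuracy": accuracy_score(labels, preds),  # exact-match accuracy
    }
```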
Guide: Running Locally
To run the model locally, follow these steps:
- Clone the Repository: download the model files from the Hugging Face Hub, or let the Transformers library fetch them automatically on first load.
- Install Dependencies: Ensure the following libraries are installed:
- Transformers 4.21.3
- PyTorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1
- Load the Model: use the Transformers library to load the fine-tuned model and its tokenizer.
- Inference: prepare your input text and run inference, as shown in the sketch after this list.
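Putting the steps together, a minimal end-to-end sketch is shown below. The repository id is inferred from the model name, the sample input is invented, and the sigmoid decoding mirrors the multi-label assumption made above.

```python
# Sketch: load the fine-tuned model and classify a resume snippet.
# pip install transformers==4.21.3 torch datasets==2.4.0 tokenizers==0.12.1
# The repo id is inferred from the model name; verify it on the Hub.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "has-abi/extended_distilBERT-finetuned-resumes-sections"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

device = "cuda" if torch.cuda.is_available() else "cpu"  # use a GPU if present
model.to(device)
model.eval()

text = "Led a team of five engineers and managed quarterly budgets."  # invented
inputs = tokenizer(text, return_tensors="pt", truncation=True).to(device)

with torch.no_grad():
    logits = model(**inputs).logits

# Assuming multi-label classification: sigmoid + per-label 0.5 threshold.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p >= 0.5]
print(predicted)
```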
For optimal performance, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure.
License
The model is licensed under the Apache-2.0 license, allowing for both personal and commercial use with proper attribution and compliance with the license terms.