extended_distilBERT-finetuned-resumes-sections

has-abi

Introduction

The extended_distilBERT-finetuned-resumes-sections model is a fine-tuned version of the Geotrend/distilbert-base-en-fr-cased model. It is optimized for text classification, specifically classifying resume text into sections, and achieves strong results on evaluation metrics such as F1 score, ROC AUC, and accuracy.

Architecture

The model is based on the DistilBERT architecture, a smaller, faster, and lighter version of BERT. DistilBERT retains 97% of BERT's language-understanding capability while running 60% faster with roughly 40% fewer parameters.
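
As a quick sanity check, the architecture details can be read off the model configuration. The sketch below assumes the model is published on the Hugging Face Hub under the ID has-abi/extended_distilBERT-finetuned-resumes-sections (a repo ID inferred from the author and model name, not stated in this card):

```python
# Sketch: inspecting the DistilBERT configuration of the fine-tuned model.
# The Hub repo ID is an assumption based on the author and model name.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "has-abi/extended_distilBERT-finetuned-resumes-sections"
)
print(config.model_type)  # "distilbert"
print(config.n_layers)    # 6 transformer layers, half of BERT-base's 12
print(config.dim)         # hidden size of 768
```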

Training

Training Hyperparameters

  • Learning Rate: 2e-05
  • Train Batch Size: 8
  • Eval Batch Size: 8
  • Seed: 42
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • LR Scheduler Type: Linear
  • Num Epochs: 20
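
For reference, these settings map directly onto the Hugging Face Trainer API. The following is a minimal sketch of the equivalent TrainingArguments, not the author's actual training script (the output_dir is a placeholder):

```python
# Sketch: the hyperparameters above expressed as Transformers TrainingArguments.
# Not the author's original script; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="extended_distilBERT-finetuned-resumes-sections",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
# training_args would then be passed to a Trainer together with the
# model and the tokenized train/eval datasets.
```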

Training Results

The model was trained over 20 epochs, achieving the following results:

  • Final Training Loss: 0.0321
  • Validation Loss: 0.0334
  • F1 Score: 0.9735
  • ROC AUC: 0.9850
  • Accuracy: 0.9715
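
The card does not spell out how these metrics were computed; a common setup for this combination (F1, ROC AUC, and accuracy together) is multi-label classification with sigmoid-thresholded logits. A minimal sketch under that assumption, using scikit-learn:

```python
# Sketch: computing the reported metrics with scikit-learn.
# Assumption: a multi-label setup where raw logits are passed through a
# sigmoid and thresholded at 0.5; the model card does not state the setup.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def compute_metrics(logits: np.ndarray, labels: np.ndarray) -> dict:
    probs = 1 / (1 + np.exp(-logits))   # sigmoid over raw logits
    preds = (probs >= 0.5).astype(int)  # threshold each label at 0.5
    return {
        "f1": f1_score(labels, preds, average="micro"),
        "roc_auc": roc_auc_score(labels, probs, average="micro"),
        "accuracy": accuracy_score(labels, preds),  # exact-match accuracy
    }

# Toy usage with two examples and three labels:
logits = np.array([[2.3, -1.1, 0.4], [-0.5, 3.2, -2.0]])
labels = np.array([[1, 0, 1], [0, 1, 0]])
print(compute_metrics(logits, labels))
```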

Guide: Running Locally

To run the model locally, follow these steps:

  1. Clone the Repository: Download the model from its repository.
  2. Install Dependencies: Ensure the following libraries are installed:
    • Transformers 4.21.3
    • PyTorch 1.12.1+cu113
    • Datasets 2.4.0
    • Tokenizers 0.12.1
  3. Load the Model: Use the Transformers library to load the fine-tuned model (see the sketch after this list).
  4. Inference: Prepare your data and run inference using the model.
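
Putting steps 3 and 4 together, a minimal inference sketch (the Hub repo ID is an assumption based on the author and model name):

```python
# Sketch: loading the fine-tuned model and classifying a resume snippet.
# The Hub repo ID is an assumption based on the author and model name.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "has-abi/extended_distilBERT-finetuned-resumes-sections"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Developed REST APIs in Python and deployed them on AWS."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Single-label readout for simplicity; a multi-label setup would instead
# apply a sigmoid and threshold each label's score.
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```

If the model was trained as a multi-label classifier, replace the argmax readout with a sigmoid threshold, as in the metrics sketch above.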

For optimal performance, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure.

License

The model is licensed under the Apache-2.0 license, allowing for both personal and commercial use with proper attribution and compliance with the license terms.
