ModernBERT-base-ft-fineweb-edu-annotations (mrm8488)
Introduction
ModernBERT-base-ft-fineweb-edu-annotations is a fine-tuned version of the ModernBERT-base model for text classification. It is evaluated with F1 score, precision, and recall, and is released under the Apache-2.0 license.
Architecture
The model is based on the ModernBERT architecture and is distributed through the Hugging Face Transformers library. Its weights are stored in the safetensors format, training runs are tracked with TensorBoard, and the model is compatible with inference endpoints, making it suitable for deployment in various environments.
Training
The training process employed the following hyperparameters:
- Learning Rate: 8e-05
- Train Batch Size: 32
- Eval Batch Size: 32
- Seed: 42
- Optimizer: AdamW with betas=(0.9, 0.98), epsilon=1e-06
- LR Scheduler Type: Linear
- Number of Epochs: 3
Training results showed a loss of 1.1047, an F1 Score of 0.7565, a Precision Score of 0.7603, and a Recall Score of 0.7545. Frameworks used include Transformers 4.48.0.dev0, PyTorch 2.5.1+cu121, Datasets 3.2.0, and Tokenizers 0.21.0.
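Reporting F1, precision, and recall together like this is typically done with a `compute_metrics` callback. A hedged sketch using scikit-learn (the macro averaging is an assumption; the card does not state which averaging was used):

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

def compute_metrics(eval_pred):
    """Turn (logits, labels) from a Trainer evaluation into a metric dict."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return {
        "f1": f1_score(labels, preds, average="macro", zero_division=0),
        "precision": precision_score(labels, preds, average="macro", zero_division=0),
        "recall": recall_score(labels, preds, average="macro", zero_division=0),
    }
```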
Guide: Running Locally
To run the model locally, follow these steps:
- Install the necessary libraries: Transformers, PyTorch, Datasets, and Tokenizers.
- Download the model from the Hugging Face repository.
- Load the model using the Transformers library.
- Prepare your dataset and run inference.
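The steps above can be sketched with the Transformers `pipeline` API. A minimal example, assuming the repository id follows the model name and author given here:

```python
from transformers import pipeline

# Download the fine-tuned model from the Hugging Face Hub and classify text.
classifier = pipeline(
    "text-classification",
    model="mrm8488/ModernBERT-base-ft-fineweb-edu-annotations",
)

preds = classifier("Photosynthesis converts light energy into chemical energy.")
print(preds)  # a list of {"label": ..., "score": ...} dicts
```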
For optimal performance, consider using cloud GPUs such as those offered by AWS, GCP, or Azure, which can handle the computational demands of the model.
License
This model is released under the Apache-2.0 license, allowing for flexible use and distribution.