mrm8488/ModernBERT-large-ft-fineweb-edu-annotations
Introduction
ModernBERT-large-ft-fineweb-edu-annotations is a fine-tuned version of answerdotai's ModernBERT-large model for text classification. On its evaluation set it achieves a loss of 1.1702, an F1 score of 0.7571, a precision of 0.7609, and a recall of 0.7554.
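Metrics of this kind are typically computed with standard scikit-learn functions. The sketch below is illustrative only: the labels are hypothetical, and the "weighted" averaging strategy is an assumption, since the card does not state which average was reported.

from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical gold labels and model predictions for a multi-class task
y_true = [0, 2, 1, 3, 2]
y_pred = [0, 2, 2, 3, 1]

print(f1_score(y_true, y_pred, average="weighted"))
print(precision_score(y_true, y_pred, average="weighted"))
print(recall_score(y_true, y_pred, average="weighted"))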
Architecture
The model is built on the Hugging Face Transformers library and extends the answerdotai/ModernBERT-large base model with a sequence-classification head. Fine-tuning used the hyperparameters listed in the Training section below.
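One way to confirm the underlying architecture is to load the model configuration through the Transformers config API, as sketched below (ModernBERT support is assumed to require a recent Transformers release):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("mrm8488/ModernBERT-large-ft-fineweb-edu-annotations")
print(config.model_type)         # expected: "modernbert"
print(config.hidden_size)        # hidden width of ModernBERT-large
print(config.num_hidden_layers)  # transformer depth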
Training
The model was trained with the following hyperparameters (a sketch of the corresponding TrainingArguments follows the list):
- Learning Rate: 8e-05
- Train Batch Size: 24
- Eval Batch Size: 24
- Seed: 42
- Optimizer: AdamW with betas=(0.9, 0.98), epsilon=1e-06
- LR Scheduler Type: Linear
- Number of Epochs: 3
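These settings map directly onto Transformers TrainingArguments. The argument names below are real, but this reconstruction of the training setup is an assumption, not the author's original script:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ModernBERT-large-ft-fineweb-edu-annotations",  # hypothetical output path
    learning_rate=8e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)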
Training results included a final validation loss of 1.1702 with an F1 score of 0.7571, precision of 0.7609, and recall of 0.7554.
Guide: Running Locally
To run the model locally, follow these steps:
- Clone the repository and navigate to the model directory.
- Install the dependencies with pip (ModernBERT requires a recent Transformers release):
pip install "transformers>=4.48" torch datasets tokenizers
- Load the model using the Transformers library in Python:
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("mrm8488/ModernBERT-large-ft-fineweb-edu-annotations")
tokenizer = AutoTokenizer.from_pretrained("mrm8488/ModernBERT-large-ft-fineweb-edu-annotations")
- Run inference or fine-tuning as needed; a short inference sketch follows.
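The sketch below classifies a single passage, reusing the model and tokenizer loaded in the previous step. The example text is illustrative, and the label lookup falls back to the raw class id because the card does not document the id-to-label mapping:

import torch

# Assumes `model` and `tokenizer` were loaded as shown above
text = "Photosynthesis converts light energy into chemical energy in plants."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map it to a label name if one exists
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label.get(predicted_id, str(predicted_id)))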
Cloud GPUs
For computational efficiency, consider running the model on cloud GPUs provided by services such as AWS, Google Cloud, or Azure.
License
This model is licensed under the Apache-2.0 License, permitting use, distribution, and modification under specified terms.