mrm8488/ModernBERT-large-ft-fineweb-edu-annotations
Introduction
ModernBERT-large-ft-fineweb-edu-annotations is a fine-tuned version of answerdotai's ModernBERT-large model for text classification. On its evaluation set it achieves a loss of 1.1702, an F1 score of 0.7571, a precision of 0.7609, and a recall of 0.7554.
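Metrics of this kind are typically computed with standard scikit-learn functions. The sketch below is illustrative only: the labels are hypothetical, and the "weighted" averaging strategy is an assumption, since the card does not state which average was reported.

from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical gold labels and model predictions for a multi-class task
y_true = [0, 2, 1, 3, 2]
y_pred = [0, 2, 2, 3, 1]

print(f1_score(y_true, y_pred, average="weighted"))
print(precision_score(y_true, y_pred, average="weighted"))
print(recall_score(y_true, y_pred, average="weighted"))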
Architecture
The model is built on the Hugging Face Transformers library and extends the answerdotai/ModernBERT-large base model with a sequence-classification head. Fine-tuning used the hyperparameters listed in the Training section below.
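One way to confirm the underlying architecture is to load the model configuration through the Transformers config API, as sketched below (ModernBERT support is assumed to require a recent Transformers release):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("mrm8488/ModernBERT-large-ft-fineweb-edu-annotations")
print(config.model_type)         # expected: "modernbert"
print(config.hidden_size)        # hidden width of ModernBERT-large
print(config.num_hidden_layers)  # transformer depth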
Training
The model was trained with the following hyperparameters (a sketch of the corresponding TrainingArguments follows the list):
- Learning Rate: 8e-05
- Train Batch Size: 24
- Eval Batch Size: 24
- Seed: 42
- Optimizer: AdamW with betas=(0.9, 0.98), epsilon=1e-06
- LR Scheduler Type: Linear
- Number of Epochs: 3
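These settings map directly onto Transformers TrainingArguments. The argument names below are real, but this reconstruction of the training setup is an assumption, not the author's original script:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ModernBERT-large-ft-fineweb-edu-annotations",  # hypothetical output path
    learning_rate=8e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)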
Training results included a final validation loss of 1.1702 with an F1 score of 0.7571, precision of 0.7609, and recall of 0.7554.
Guide: Running Locally
To run the model locally, follow these steps:
- Clone the repository and navigate to the model directory.
- Install the dependencies with pip (ModernBERT requires a recent Transformers release):
pip install "transformers>=4.48" torch datasets tokenizers
- Load the model using the Transformers library in Python:
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("mrm8488/ModernBERT-large-ft-fineweb-edu-annotations")
tokenizer = AutoTokenizer.from_pretrained("mrm8488/ModernBERT-large-ft-fineweb-edu-annotations")
- Run inference or fine-tuning as needed; a short inference sketch follows.
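The sketch below classifies a single passage, reusing the model and tokenizer loaded in the previous step. The example text is illustrative, and the label lookup falls back to the raw class id because the card does not document the id-to-label mapping:

import torch

# Assumes `model` and `tokenizer` were loaded as shown above
text = "Photosynthesis converts light energy into chemical energy in plants."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map it to a label name if one exists
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label.get(predicted_id, str(predicted_id)))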
Cloud GPUs
For computational efficiency, consider running the model on cloud GPUs provided by services such as AWS, Google Cloud, or Azure.
License
This model is licensed under the Apache-2.0 License, permitting use, distribution, and modification under specified terms.