gbert-base-finetuned-cefr

BramVanroy

Introduction

The gbert-base-finetuned-cefr model is a fine-tuned version of GBERT, a German BERT model, adapted for text classification: given a German text, it predicts a CEFR proficiency level. CEFR, the Common European Framework of Reference for Languages, is a widely used standard that describes language proficiency on six levels, from A1 (beginner) to C2 (mastery).

Architecture

This model builds on the BERT architecture, a transformer encoder widely used for natural language processing tasks, with a sequence-classification head on top. It was fine-tuned on German learner datasets such as MERLIN and DISKO to improve its proficiency assessment of written German text.
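
As a quick sanity check, the classification head's label set can be read from the published model configuration. The sketch below assumes only the repository id given on this card; the exact id2label mapping (expected to correspond to CEFR levels) should be verified against the actual output.

    from transformers import AutoConfig

    # Fetch only the configuration file; no model weights are downloaded.
    config = AutoConfig.from_pretrained("BramVanroy/gbert-base-finetuned-cefr")

    # Number of output classes and their human-readable names
    # (expected to be CEFR levels such as A1 through C2).
    print(config.num_labels)
    print(config.id2label)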

Training and Evaluation

The fine-tuned model reports the following evaluation metrics:

  • Accuracy: 0.83
  • F1 Score: 0.83
  • Precision: 0.84
  • Recall: 0.83
  • Quadratic Weighted Kappa (QWK): 0.95

These metrics highlight its effectiveness in classifying text according to CEFR proficiency levels.
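
For reference, all of these metrics can be reproduced with scikit-learn. The sketch below uses made-up gold labels and predictions encoded as ordered integers (A1 = 0 through C2 = 5); the card does not state which averaging its F1, precision, and recall figures use, so the weighted averaging here is an assumption.

    from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                                 precision_recall_fscore_support)

    # Hypothetical gold labels and predictions (A1 = 0 ... C2 = 5).
    y_true = [0, 1, 2, 3, 4, 5, 2, 3]
    y_pred = [0, 1, 2, 3, 5, 5, 1, 3]

    accuracy = accuracy_score(y_true, y_pred)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0
    )
    # Quadratic weighting penalizes predictions that land further from the
    # gold level more heavily, which suits an ordinal scale like CEFR.
    qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")

    print(f"accuracy={accuracy:.2f} f1={f1:.2f} "
          f"precision={precision:.2f} recall={recall:.2f} qwk={qwk:.2f}")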

Guide: Running Locally

To run the model locally, follow these steps:

  1. Set up the Environment: Ensure Python and PyTorch are installed on your system.
  2. Clone the Repository: Use Git (with Git LFS enabled, since the model weights are stored as LFS objects) to clone the model repository.
    git clone https://huggingface.co/BramVanroy/gbert-base-finetuned-cefr
    
  3. Install Dependencies: Navigate into the cloned directory and install the required Python packages, at minimum transformers and torch.
    cd gbert-base-finetuned-cefr
    pip install transformers torch
    
  4. Run Inference: Load the model with the transformers library and pass in your text for classification, as in the sketch below.
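
A minimal inference sketch using the transformers pipeline API follows; the example sentence is arbitrary, and the returned label names depend on the model's id2label mapping (expected to be CEFR levels).

    from transformers import pipeline

    # Downloads the model on first use; point `model` at the cloned
    # directory instead to load the local copy.
    classifier = pipeline(
        "text-classification",
        model="BramVanroy/gbert-base-finetuned-cefr",
    )

    # Classify a German sentence; the top predicted level is returned.
    text = "Ich lerne seit zwei Jahren Deutsch und lese gerne kurze Geschichten."
    print(classifier(text))  # e.g. [{'label': '...', 'score': 0.97}]

To see scores for all levels at once, pass top_k=None in the classifier call.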

A base-sized BERT model runs acceptably on CPU for occasional classifications; for large-scale batch inference, consider cloud GPUs from providers such as AWS, Google Cloud, or Azure.

License

This model is released under the MIT License, allowing for wide usage and modification with minimal restrictions.
