gbert-base-finetuned-cefr
Introduction
The gbert-base-finetuned-cefr model, published by BramVanroy, is a fine-tuned version of the German BERT (GBERT) base model for text classification, specifically for predicting CEFR proficiency levels. CEFR, the Common European Framework of Reference for Languages, is a widely used standard for describing language proficiency on a six-level scale from A1 (beginner) to C2 (mastery).
Architecture
This model builds on the BERT architecture, a transformer-based encoder widely used for natural language processing tasks. It was fine-tuned on datasets such as merlin and disko to improve its ability to assess the proficiency level of written German text.
Training
The fine-tuned model achieves the following evaluation metrics:
- Accuracy: 0.83
- F1 Score: 0.83
- Precision: 0.84
- Recall: 0.83
- Quadratic Weighted Kappa (QWK): 0.95
These results indicate strong performance in classifying text by CEFR proficiency level; the high QWK in particular reflects close agreement with the gold labels on the ordinal CEFR scale.
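Quadratic weighted kappa treats the CEFR levels as an ordered scale and penalizes disagreements by their squared ordinal distance. As a reference for how the metric is computed (not the model's actual evaluation code), here is a minimal sketch using scikit-learn with made-up labels:

```python
from sklearn.metrics import cohen_kappa_score

# CEFR levels mapped to ordinal integers: A1=0, A2=1, ..., C2=5.
y_true = [0, 1, 2, 3, 4, 5, 2, 3]  # gold labels (illustrative only)
y_pred = [0, 1, 2, 2, 4, 5, 3, 3]  # model predictions (illustrative only)

# Quadratic weighting penalizes disagreements by the squared ordinal
# distance, which suits ordered labels such as CEFR levels.
qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")
print(f"QWK: {qwk:.2f}")
```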
Guide: Running Locally
To run the model locally, follow these steps:
- Set up the Environment: Ensure Python and PyTorch are installed on your system.
- Clone the Repository: Use Git to clone the model repository.
```bash
git clone https://huggingface.co/BramVanroy/gbert-base-finetuned-cefr
```
- Install Dependencies: Navigate into the cloned directory and install the necessary Python packages.
```bash
cd gbert-base-finetuned-cefr
pip install -r requirements.txt
```
If the repository does not ship a requirements.txt, installing transformers and torch directly covers the examples below.
- Run Inference: Load the model with the transformers library and pass in your text for classification, as sketched below.
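The following is a minimal inference sketch using the transformers auto classes; the example sentence is arbitrary, and the exact label names are defined in the model's own config rather than guaranteed here:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "BramVanroy/gbert-base-finetuned-cefr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Arbitrary German example text to classify.
text = "Ich möchte gerne einen Kaffee bestellen."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# id2label maps class indices to label names as stored in the model config.
predicted_id = logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_id])
```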
For optimal performance, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure.
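If a CUDA GPU is available, the high-level pipeline API can place the model on it explicitly; a minimal sketch:

```python
import torch
from transformers import pipeline

# device=0 selects the first CUDA GPU; -1 falls back to CPU.
device = 0 if torch.cuda.is_available() else -1
classifier = pipeline(
    "text-classification",
    model="BramVanroy/gbert-base-finetuned-cefr",
    device=device,
)
print(classifier("Das Wetter ist heute sehr schön."))
```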
License
This model is released under the MIT License, allowing for wide usage and modification with minimal restrictions.