dbmdz/distilbert-base-turkish-cased
Introduction
DistilBERTurk is a distilled version of the BERT model specifically tailored for the Turkish language. It is a community-driven project by the MDZ Digital Library team at the Bavarian State Library, intended to provide a smaller, faster, and more efficient alternative to the original BERTurk model.
Architecture
DistilBERTurk is based on the DistilBERT architecture, a lighter variant of BERT. It uses the same transformer building blocks but with half the number of layers, which reduces the parameter count substantially (roughly 40% in the original DistilBERT) while retaining most of the teacher's performance.
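As a quick illustration (not part of the original model card), the student configuration can be compared with its BERTurk teacher; the teacher identifier dbmdz/bert-base-turkish-cased is assumed here.

from transformers import AutoConfig, AutoModel

# Inspect the size difference between the distilled student and its assumed
# BERTurk teacher (dbmdz/bert-base-turkish-cased).
student_cfg = AutoConfig.from_pretrained("dbmdz/distilbert-base-turkish-cased")
teacher_cfg = AutoConfig.from_pretrained("dbmdz/bert-base-turkish-cased")

print("DistilBERTurk layers:", student_cfg.n_layers)        # DistilBertConfig field
print("BERTurk layers:", teacher_cfg.num_hidden_layers)     # BertConfig field

# Counting parameters requires actually loading the weights.
student = AutoModel.from_pretrained("dbmdz/distilbert-base-turkish-cased")
print("DistilBERTurk parameters:", sum(p.numel() for p in student.parameters()))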
Training
The model was trained on 7GB of data using the cased version of BERTurk as the teacher model. Training was run with the official Hugging Face implementation for five days on four RTX 2080 Ti GPUs. The model has been evaluated on PoS tagging and NER, where it shows performance competitive with the full BERTurk model.
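Since the checkpoint ships without task heads, fine-tuning is needed for PoS tagging or NER. Below is a minimal sketch, assuming a hypothetical NER label set; a real dataset and a training loop (e.g. the Trainer API) would be required before predictions are meaningful.

from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical label set for illustration; a real NER dataset defines its own.
labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-ORG", "I-ORG"]

tokenizer = AutoTokenizer.from_pretrained("dbmdz/distilbert-base-turkish-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "dbmdz/distilbert-base-turkish-cased",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)

# The token-classification head is randomly initialized and must be trained
# on labeled Turkish data before use.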
Guide: Running Locally
- Environment Setup: Ensure you have Python and PyTorch installed, then install the Transformers library:
  pip install transformers
- Loading the Model: Load the model and tokenizer with the following code (a full inference sketch follows this list):
  from transformers import AutoModel, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("dbmdz/distilbert-base-turkish-cased")
  model = AutoModel.from_pretrained("dbmdz/distilbert-base-turkish-cased")
- Model Weights: Currently, only PyTorch-compatible weights are available. TensorFlow checkpoints can be requested in the BERTurk repository.
- Cloud GPUs: For faster inference, consider using a cloud GPU service such as Google Colab, AWS, or Azure.
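As referenced in the loading step above, here is a minimal inference sketch using the same model and tokenizer; the Turkish example sentence is illustrative only.

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dbmdz/distilbert-base-turkish-cased")
model = AutoModel.from_pretrained("dbmdz/distilbert-base-turkish-cased")
model.eval()

# Illustrative Turkish sentence.
text = "Merhaba dünya, bu bir deneme cümlesidir."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch_size, sequence_length, hidden_size).
print(outputs.last_hidden_state.shape)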
License
DistilBERTurk is released under the MIT license, allowing for free use, modification, and distribution of the model.