jobGBERT
Introduction
jobGBERT is a domain-adapted, transformer-based language model for processing German-language job advertisements. It is built on the deepset/gbert-base model and further trained on 4 million job ads from Switzerland, spanning the years 1990 to 2020.
Architecture
- Model Type: BERT base
- Language: German
- Domain: Job advertisements
Training
jobGBERT was adapted to its target domain through continued in-domain pretraining on a large corpus of job advertisements: 5.9 GB of text extracted from Swiss job ads spanning three decades.
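The model card does not publish the exact pretraining script; a minimal sketch of continued masked-language-model pretraining with the Hugging Face Trainer might look like the following. The corpus file name, hyperparameters, and output path are assumptions for illustration.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Start from the general-domain German BERT checkpoint
model_name = "deepset/gbert-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical corpus file: one job ad per line
dataset = load_dataset("text", data_files={"train": "job_ads.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard BERT-style dynamic masking: 15% of tokens are masked
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="jobgbert-pretraining",   # assumed output path
    per_device_train_batch_size=16,      # assumed hyperparameter
    num_train_epochs=1,                  # assumed hyperparameter
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```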
Guide: Running Locally
- Installation: Ensure you have Python, PyTorch, and the Hugging Face Transformers library installed.
- Loading the Model: Use the `from_pretrained` method to load `agne/jobGBERT`.
- Inference: Implement the fill-mask pipeline for masked language modeling tasks, as shown in the sketch after this list.
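A minimal inference sketch using the fill-mask pipeline; the German example sentence is illustrative, not taken from the model card:

```python
from transformers import pipeline

# Load jobGBERT in a fill-mask pipeline
fill_mask = pipeline("fill-mask", model="agne/jobGBERT")

# Illustrative job-ad sentence with one masked token
results = fill_mask("Wir suchen eine erfahrene [MASK] für unser Team in Zürich.")

# Print the top predicted tokens with their scores
for r in results:
    print(f"{r['token_str']}: {r['score']:.3f}")
```

The pipeline returns the highest-scoring completions for the `[MASK]` token, which is the standard mask token inherited from the GBERT tokenizer.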
For optimal performance, consider using cloud GPU services like AWS, Google Cloud, or Azure.
License
jobGBERT is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (cc-by-nc-sa-4.0). Proper citation is required when the model is used in academic or research contexts.