jjzha/jobbert-base-cased
Introduction
JobBERT is a domain-adapted language model for extracting hard and soft skills from English job postings. It was initialized from a bert-base-cased checkpoint and continuously pre-trained on approximately 3.2 million sentences drawn from job listings. The model was developed by Mike Zhang, Kristian Nørgaard Jensen, Sif Dam Sonniks, and Barbara Plank and is documented in the paper "SkillSpan: Hard and Soft Skill Extraction from English Job Postings."
Architecture
JobBERT uses the standard BERT architecture, specifically the bert-base-cased variant. Rather than being trained from scratch, the checkpoint is continuously pre-trained so that its representations better fit the language of job postings.
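You can verify this by inspecting the checkpoint's configuration. A minimal sketch, assuming the model is published on the Hugging Face Hub under the `jjzha/jobbert-base-cased` ID:

```python
from transformers import AutoConfig

# Checkpoint ID is an assumption based on the jjzha namespace;
# check the model page for the exact identifier.
config = AutoConfig.from_pretrained("jjzha/jobbert-base-cased")

# The settings mirror bert-base-cased: 12 layers, 12 heads, 768 hidden units.
print(config.model_type)           # "bert"
print(config.num_hidden_layers)    # 12
print(config.num_attention_heads)  # 12
print(config.hidden_size)          # 768
```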
Training
Training continues BERT's pre-training objective on sentences from job postings to improve the model's ability to extract skills. This domain-adaptive pre-training is what allows JobBERT to outperform models that are not adapted to the job-posting domain.
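For illustration, continued pre-training of this kind can be run with the Transformers `Trainer` and a masked language modeling collator. A minimal sketch, assuming a plain-text file `job_postings.txt` with one sentence per line; the file name and hyperparameters are placeholders, not the paper's settings:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")

# One job-posting sentence per line; the file name is a placeholder.
dataset = load_dataset("text", data_files={"train": "job_postings.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard MLM setup: randomly mask 15% of tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="jobbert-continued",
    per_device_train_batch_size=32,  # placeholder hyperparameters
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```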
Guide: Running Locally
- Setup Environment: Ensure you have Python and PyTorch installed.
- Install the Transformers Library: Run `pip install transformers` to get the Hugging Face library.
- Download the Model: Load JobBERT with the Hugging Face `transformers` library.
- Run Inference: Use the model for skill extraction on job postings, as shown in the sketch after this list.
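The following minimal sketch loads the encoder and extracts contextual token embeddings. The checkpoint ID `jjzha/jobbert-base-cased` is an assumption based on the model's Hub namespace, and the base checkpoint still needs fine-tuning (e.g., for token classification) before it can label skill spans:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Checkpoint ID is an assumption; see the model page for the exact identifier.
tokenizer = AutoTokenizer.from_pretrained("jjzha/jobbert-base-cased")
model = AutoModel.from_pretrained("jjzha/jobbert-base-cased")

sentence = "We are looking for Python experience and strong communication skills."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional contextual embedding per token.
print(outputs.last_hidden_state.shape)
```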
For optimal performance, especially with large datasets, consider using cloud GPUs such as those available on AWS, Google Cloud, or Azure.
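As a sketch of GPU usage, standard PyTorch device placement moves the same workflow onto a CUDA device when one is available (again assuming the `jjzha/jobbert-base-cased` checkpoint):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Fall back to CPU when no CUDA device is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("jjzha/jobbert-base-cased")
model = AutoModel.from_pretrained("jjzha/jobbert-base-cased").to(device)
model.eval()

sentences = [
    "Strong SQL skills and teamwork required.",
    "Experience with Kubernetes is a plus.",
]
# Batch inference: pad to the longest sentence in the batch.
inputs = tokenizer(sentences, padding=True, return_tensors="pt").to(device)
with torch.no_grad():
    embeddings = model(**inputs).last_hidden_state
print(embeddings.shape)  # (2, max_tokens, 768)
```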
License
Please refer to the Hugging Face model page or repository for specific licensing information related to JobBERT.