NB-BERT-large

NbAiLab

Introduction

NB-BERT-large is a Norwegian language model based on the BERT-large architecture. Developed by the National Library of Norway, it is trained on a broad collection of Norwegian text in both Bokmål and Nynorsk, using a monolingual Norwegian vocabulary.

Architecture

NB-BERT-large is built on the BERT-large architecture, a bidirectional transformer encoder that has proven effective across natural language processing tasks. Pretrained with a masked-language-modeling objective on a large corpus of Norwegian text, the model is particularly well suited to fill-mask tasks.
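
The architecture details can be verified by reading the model configuration with the Transformers library. This is a minimal sketch, assuming the Hugging Face model ID NbAiLab/nb-bert-large:

```python
# Minimal sketch: inspect the model configuration.
# Assumes the Hugging Face model ID "NbAiLab/nb-bert-large".
from transformers import AutoConfig

config = AutoConfig.from_pretrained("NbAiLab/nb-bert-large")

# BERT-large dimensions: 24 encoder layers, hidden size 1024, 16 attention heads.
print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)
```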

Training

The model was trained from scratch on a comprehensive dataset of Norwegian text spanning the past 200 years. This breadth of training data equips the model to handle linguistic variation in both Bokmål and Nynorsk. The training set and further details are available in NbAiLab's GitHub repository.

Guide: Running Locally

To run NB-BERT-large locally, follow these steps:

  1. Environment Setup:

    • Install Python and required libraries such as PyTorch or TensorFlow.
    • Use a package manager like pip to install the Hugging Face Transformers library.
  2. Model Download:

    • Download the model from Hugging Face's model hub using the Transformers library.
  3. Inference:

    • Load the model in your script and run inference tasks such as fill-mask (see the sketch after this list).
  4. Hardware Suggestions:

    • For efficient processing, a GPU is recommended, for example the cloud-based GPUs offered by AWS, Google Cloud, or Azure (a device-selection snippet follows the example below).
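
The steps above can be combined into a short script. The following is a minimal sketch covering steps 1 through 3; it assumes the Hugging Face model ID NbAiLab/nb-bert-large, and the Norwegian example sentence is purely illustrative:

```python
# Minimal sketch covering steps 1-3. Assumes the model ID
# "NbAiLab/nb-bert-large"; install the dependencies first:
#   pip install transformers torch
from transformers import pipeline

# Step 2: the pipeline downloads the model from the Hugging Face Hub
# on first use and caches it locally.
fill_mask = pipeline("fill-mask", model="NbAiLab/nb-bert-large")

# Step 3: run a fill-mask query on a Norwegian sentence (illustrative example).
for prediction in fill_mask("På biblioteket kan du låne en [MASK]."):
    print(f"{prediction['token_str']:>12}  score={prediction['score']:.3f}")
```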
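
For step 4, the pipeline can be placed on a GPU when one is available. A minimal device-selection sketch, under the same model-ID assumption:

```python
# Sketch: select a GPU if available, otherwise fall back to CPU.
import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1  # pipeline uses -1 for CPU
fill_mask = pipeline("fill-mask", model="NbAiLab/nb-bert-large", device=device)
```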

License

NB-BERT-large is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This allows users to share and adapt the model, provided appropriate credit is given.
