bert-large-swedish-cased

AI-Nordics

Introduction

bert-large-swedish-cased is a Swedish language model based on the BERT Large architecture. It was trained with the Megatron-LM framework and is suitable for tasks such as masked language modeling and next sentence prediction.

Architecture

This model adheres to the BERT Large architecture and is configured with the following parameters (a short sketch for checking them programmatically follows the list):

  • Number of Parameters: 340 million
  • Number of Layers: 24
  • Number of Attention Heads: 16
  • Context Size: 1024
  • Vocabulary Size: 30,592
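
As a quick sanity check, these values can be read from the published checkpoint configuration with the Transformers library. A minimal sketch, assuming the standard Hugging Face BertConfig attribute names (the context size corresponds to max_position_embeddings):

    from transformers import AutoConfig

    # Download only the configuration, not the full 340M-parameter weights.
    config = AutoConfig.from_pretrained("AI-Nordics/bert-large-swedish-cased")

    print(config.num_hidden_layers)         # expected: 24
    print(config.num_attention_heads)       # expected: 16
    print(config.max_position_embeddings)   # expected: 1024 (context size)
    print(config.vocab_size)                # expected: 30592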

Training

The model was pre-trained for 600,000 steps with a batch size of 512 (a back-of-the-envelope token count follows the source list). The training data is a diverse Swedish text corpus of approximately 85 GB, sourced from:

  • Politics (e.g., Anföranden, DCEP, DGT, SOU)
  • Medical (e.g., Fass)
  • Legal (e.g., Författningar, JRC)
  • Miscellaneous web data
  • Books (e.g., Litteraturbanken)
  • Drama (e.g., Subtitles)
  • Wikipedia
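
For a sense of scale, the reported step count, batch size, and context size give a rough upper bound on the number of tokens processed during pre-training. A back-of-the-envelope sketch, assuming every sequence fills the full 1024-token context:

    # Figures from the training description above.
    steps = 600_000     # pre-training steps
    batch_size = 512    # sequences per step
    seq_len = 1024      # tokens per sequence (upper bound)

    tokens_seen = steps * batch_size * seq_len
    print(f"~{tokens_seen / 1e9:.0f}B tokens")  # roughly 315 billion tokens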

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the Transformers Library:

    pip install transformers
    
  2. Load the Model and Tokenizer:

    from transformers import AutoTokenizer, AutoModelForMaskedLM
    
    tokenizer = AutoTokenizer.from_pretrained("AI-Nordics/bert-large-swedish-cased")
    model = AutoModelForMaskedLM.from_pretrained("AI-Nordics/bert-large-swedish-cased")
    
  3. Use a Cloud GPU: for efficient execution, consider GPU instances from cloud providers such as AWS, Google Cloud, or Azure.
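
Once the model and tokenizer load, the fill-mask pipeline is a quick way to exercise masked language modeling. A minimal sketch; the Swedish example sentence is illustrative only:

    from transformers import pipeline

    # The pipeline loads both the model and the tokenizer from the Hub.
    fill_mask = pipeline("fill-mask", model="AI-Nordics/bert-large-swedish-cased")

    # Build the prompt with the tokenizer's own mask token ([MASK] for BERT).
    sentence = f"Stockholm är Sveriges {fill_mask.tokenizer.mask_token}."
    for prediction in fill_mask(sentence):
        print(prediction["token_str"], round(prediction["score"], 3))

Each prediction is a dictionary containing, among other fields, the predicted token string and its score.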

License

The model is distributed under a specific license; review the licensing terms on the model page before use.
