bert-large-swedish-cased
AI-Nordics
Introduction
The BERT-LARGE-SWEDISH-CASED model is a Swedish language model based on the BERT Large architecture. It is implemented with the Megatron-LM framework and is suitable for tasks such as masked language modeling and next sentence prediction.
Architecture
This model adheres to the BERT Large architecture and is configured with the following parameters (a configuration-check sketch follows the list):
- Number of Parameters: 340 million
- Number of Layers: 24
- Number of Attention Heads: 16
- Context Size: 1,024 tokens
- Vocabulary Size: 30,592
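To confirm these figures against the published checkpoint, the minimal sketch below loads the configuration with the Transformers AutoConfig API and derives the parameter count from the instantiated weights; the expected values in the comments simply restate the list above, and the sketch assumes the Transformers library and PyTorch are installed.

  # Sketch: inspect the published configuration and count parameters.
  from transformers import AutoConfig, AutoModelForMaskedLM

  config = AutoConfig.from_pretrained("AI-Nordics/bert-large-swedish-cased")
  print("Layers:         ", config.num_hidden_layers)        # expected: 24
  print("Attention heads:", config.num_attention_heads)      # expected: 16
  print("Context size:   ", config.max_position_embeddings)  # expected: 1024
  print("Vocabulary size:", config.vocab_size)                # expected: 30592

  # The parameter count is computed from the loaded weights, not the config.
  model = AutoModelForMaskedLM.from_pretrained("AI-Nordics/bert-large-swedish-cased")
  print("Parameters (M): ", sum(p.numel() for p in model.parameters()) / 1e6)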
Training
The model was pre-trained over 600,000 steps with a batch size of 512. The training data includes a diverse Swedish text corpus of approximately 85 GB, sourced from:
- Politics (e.g., Anföranden, DCEP, DGT, SOU)
- Medical (e.g., Fass)
- Legal (e.g., Författningar, JRC)
- Miscellaneous web data
- Books (e.g., Litteraturbanken)
- Drama (e.g., Subtitles)
- Wikipedia
Guide: Running Locally
To run the model locally, follow these steps (a usage sketch follows the list):
- Install the Transformers Library:
  pip install transformers
- Load the Model and Tokenizer:
  from transformers import AutoTokenizer, AutoModelForMaskedLM

  tokenizer = AutoTokenizer.from_pretrained("AI-Nordics/bert-large-swedish-cased")
  model = AutoModelForMaskedLM.from_pretrained("AI-Nordics/bert-large-swedish-cased")
- Use a Cloud GPU: For efficient execution, consider cloud services such as AWS, Google Cloud, or Azure that offer GPU instances.
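Once the model and tokenizer are available, a quick way to verify the setup is a fill-mask pass. The sketch below is only illustrative: it assumes the standard BERT [MASK] token and uses an arbitrary Swedish example sentence.

  # Minimal fill-mask sketch; the example sentence is only illustrative.
  from transformers import pipeline

  fill_mask = pipeline(
      "fill-mask",
      model="AI-Nordics/bert-large-swedish-cased",
      tokenizer="AI-Nordics/bert-large-swedish-cased",
  )

  # The model proposes candidate tokens for the masked position.
  for prediction in fill_mask("Stockholm är Sveriges [MASK]."):
      print(prediction["token_str"], round(prediction["score"], 3))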
License
The model is distributed under a specific license; review the licensing terms on the model card before use.