Introduction

Recorded Future, in collaboration with AI Sweden, has released a Swedish Named Entity Recognition (NER) model. Built on the KB/bert-base-swedish-cased model, it has been fine-tuned on data from various internet sources and forums, specifically for Swedish-language inputs.

Architecture

The model architecture is built upon the BERT framework, specifically the KB/bert-base-swedish-cased variant. It leverages the Transformers library and is compatible with PyTorch. The model identifies entities such as locations, organizations, persons, religions, and titles within Swedish texts.

Training

The model was trained exclusively on Swedish data and is optimized for Swedish inputs; non-Swedish inputs are out-of-domain and unsupported. Training and evaluation were performed with Transformers >= 4.3.3 and PyTorch 1.8.0.

Evaluation Metrics

The model's performance was evaluated using the F1-score across the following entity categories:

  • Location: 0.91
  • Organization: 0.88
  • Person: 0.96
  • Religion: 0.95
  • Title: 0.84
  • Total: 0.92
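The scores above are F1-scores, the harmonic mean of precision and recall. As a quick illustration of how the metric is computed (the precision/recall values below are made up for the example, not taken from the evaluation):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical precision/recall pair for illustration only
print(round(f1_score(0.95, 0.97), 2))  # → 0.96
```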

Guide: Running Locally

  1. Install Prerequisites: Ensure that you have Python, PyTorch, and the Transformers library installed. Use the following command to install the necessary libraries:
    pip install torch transformers
    
  2. Download the Model: The Transformers library can fetch the model files automatically from the Hugging Face Hub on first use, or you can download them manually from the model page.
  3. Load the Model: Use the Transformers library to load the model into your environment.
  4. Inference: Prepare Swedish text inputs and run them through the model for entity recognition.
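Steps 2–4 can be sketched with the Transformers pipeline API. This is a minimal sketch, not the authors' reference code: the model id `recordedfuture/Swedish-NER` is an assumption (verify it on the Hugging Face model page), and the example sentence is illustrative.

```python
MODEL_ID = "recordedfuture/Swedish-NER"  # assumed id -- verify on the Hub

def build_ner(model_id: str = MODEL_ID):
    # Imported inside the function so the sketch can be read without
    # transformers installed
    from transformers import pipeline
    # aggregation_strategy="simple" merges word pieces into whole entities;
    # older Transformers versions used grouped_entities=True instead
    return pipeline(
        "ner",
        model=model_id,
        tokenizer=model_id,
        aggregation_strategy="simple",
    )

if __name__ == "__main__":
    ner = build_ner()  # downloads the model weights on first call
    for ent in ner("Anna Svensson arbetar på Volvo i Göteborg."):
        print(ent["entity_group"], ent["word"], round(ent["score"], 3))
```

Each result dict carries the merged entity text (`word`), its predicted category (`entity_group`), and a confidence score.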

Suggested Cloud GPUs

For faster inference, consider cloud GPU services such as AWS EC2, Google Cloud Platform, or Azure, which offer GPU instances suited to transformer models.

License

The Swedish NER model is released under the MIT License, allowing for broad use and modification.
