bert-base-indonesian-NER

cahya

Introduction

The bert-base-indonesian-NER model, hosted on Hugging Face under the cahya namespace, is designed for token classification tasks in the Indonesian language. It is built with the Transformers library and supports the PyTorch and JAX frameworks.

Architecture

This model is based on the BERT architecture, which is widely used for natural language processing tasks, and has been fine-tuned for Named Entity Recognition (NER) in Indonesian.

Training

The model was fine-tuned with the Transformers library using the token classification pipeline. The dataset and training hyperparameters used for fine-tuning are not specified in this document.

Guide: Running Locally

  1. Install Dependencies: Ensure you have Python and PyTorch installed, along with the Transformers library:

    pip install torch transformers
    
  2. Load the Model: Use the Transformers library to load the model:

    from transformers import BertTokenizer, BertForTokenClassification
    
    tokenizer = BertTokenizer.from_pretrained("cahya/bert-base-indonesian-NER")
    model = BertForTokenClassification.from_pretrained("cahya/bert-base-indonesian-NER")
    
  3. Inference: Tokenize your input text and pass it to the model for predictions.

  4. Hardware Recommendations: For optimal performance, especially with large datasets or longer texts, consider using cloud GPU services like AWS, Google Cloud, or Azure.
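Step 3 above can be sketched end to end with the Transformers `pipeline` API, which handles tokenization and label decoding in one call. This is a minimal example; the sample sentence is illustrative, and the entity labels returned depend on the checkpoint's own label mapping (running it downloads the model from the Hugging Face Hub):

```python
from transformers import pipeline

# Token-classification pipeline; "simple" aggregation merges
# word-piece predictions into whole-entity spans.
ner = pipeline(
    "token-classification",
    model="cahya/bert-base-indonesian-NER",
    aggregation_strategy="simple",
)

# "Joko Widodo was born in Surakarta." -- illustrative input.
results = ner("Joko Widodo lahir di Surakarta.")
for entity in results:
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```

Each result is a dict with the aggregated entity label, the matched text span, and a confidence score, so downstream code can filter on `entity_group` or `score` directly.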

License

The bert-base-indonesian-NER model is distributed under the MIT License, which permits free use, modification, and redistribution.
