Indonesia BERT Sentiment Classification

mdhugol

Introduction

The Indonesian BERT Base Sentiment Classifier is a text classification model for sentiment analysis, derived from the IndoBERT Base model. It was fine-tuned on the Prosa sentiment dataset and classifies Indonesian text as positive, neutral, or negative.

Architecture

The model architecture is based on BERT (Bidirectional Encoder Representations from Transformers), specifically the IndoBERT Base model. It is implemented in PyTorch and distributed through the Hugging Face Transformers library.
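As a quick sanity check, the published configuration can be inspected to confirm the BERT base shape and the three-way classification head. A minimal sketch; the commented values are what a standard BERT base encoder is expected to report:

    from transformers import AutoConfig
    
    config = AutoConfig.from_pretrained("mdhugol/indonesia-bert-sentiment-classification")
    print(config.model_type)         # expected: "bert"
    print(config.hidden_size)        # 768 in a BERT base encoder
    print(config.num_hidden_layers)  # 12 transformer layers in a BERT base encoder
    print(config.num_labels)         # 3: positive, neutral, negative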

Training

The Indonesian BERT Base Sentiment Classifier was trained on the Prosa sentiment dataset, which is part of the IndoNLU benchmark. It starts from the pre-trained IndoBERT Base model (phase 1, uncased) and fine-tunes it for the sentiment analysis task.
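The exact fine-tuning recipe is not published on this card, but a run of the same shape can be sketched with the Trainer API. In the sketch below, the base checkpoint id indobenchmark/indobert-base-p1 and the dataset id indonlp/indonlu with its smsa configuration (the Prosa sentiment data within IndoNLU) are assumptions about current Hub naming, and the hyperparameters are illustrative defaults, not the author's settings:

    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )
    
    # Assumed Hub ids: IndoBERT Base (phase 1, uncased) and the IndoNLU
    # SmSA (Prosa sentiment) dataset.
    base = "indobenchmark/indobert-base-p1"
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)
    
    dataset = load_dataset("indonlp/indonlu", "smsa")
    
    def tokenize(batch):
        # SmSA rows carry a raw "text" field and an integer "label".
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=128)
    
    encoded = dataset.map(tokenize, batched=True)
    
    # Illustrative hyperparameters, not the author's published settings.
    args = TrainingArguments(
        output_dir="indobert-sentiment",
        num_train_epochs=3,
        per_device_train_batch_size=16,
        learning_rate=2e-5,
    )
    
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=encoded["train"],
        eval_dataset=encoded["validation"],
    )
    trainer.train()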

Guide: Running Locally

To use the model for sentiment analysis, follow these steps:

  1. Install the transformers library:

    pip install transformers torch
    
  2. Load the model and tokenizer:

    from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
    
    # Fine-tuned checkpoint published on the Hugging Face Hub.
    pretrained = "mdhugol/indonesia-bert-sentiment-classification"
    model = AutoModelForSequenceClassification.from_pretrained(pretrained)
    tokenizer = AutoTokenizer.from_pretrained(pretrained)
    
  3. Create a sentiment analysis pipeline:

    sentiment_analysis = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
    
  4. Classify text sentiments (a batched variant is sketched just after these steps):

    # The checkpoint emits generic label names; map them to sentiments.
    label_index = {'LABEL_0': 'positive', 'LABEL_1': 'neutral', 'LABEL_2': 'negative'}
    
    pos_text = "Sangat bahagia hari ini"            # "Very happy today"
    neg_text = "Dasar anak sialan!! Kurang ajar!!"  # roughly "You damn brat!! How insolent!!"
    
    # Run each example through the pipeline and print the mapped
    # label together with its confidence score.
    for text in (pos_text, neg_text):
        result = sentiment_analysis(text)
        status = label_index[result[0]['label']]
        score = result[0]['score']
        print(f'Text: {text} | Label : {status} ({score * 100:.3f}%)')
    
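The pipeline also accepts a list of strings and returns one result dict per input; the batch_size argument groups inputs into batches, which mainly pays off on a GPU. A minimal sketch reusing the sentiment_analysis pipeline and label_index mapping from the steps above:

    texts = [pos_text, neg_text]
    # One call, one result dict per input text.
    for text, result in zip(texts, sentiment_analysis(texts, batch_size=8)):
        status = label_index[result['label']]
        print(f'Text: {text} | Label : {status} ({result["score"] * 100:.3f}%)')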

A BERT base model of this size can run on CPU, but for large batches or low-latency serving, consider a GPU, for example via cloud services such as AWS, Google Cloud, or Azure.
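A minimal sketch of GPU placement, assuming a CUDA device is visible to PyTorch; the pipeline's device argument takes a GPU index, with -1 meaning CPU:

    import torch
    from transformers import pipeline
    
    # Use the first GPU when available, otherwise fall back to CPU.
    device = 0 if torch.cuda.is_available() else -1
    sentiment_analysis = pipeline(
        "sentiment-analysis",
        model="mdhugol/indonesia-bert-sentiment-classification",
        device=device,
    )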

License

The model is available for use under the terms specified on its Hugging Face model card or repository. Ensure compliance with any licensing requirements before using the model in production.
