Indonesia BERT Sentiment Classification
Introduction
The Indonesian BERT Base Sentiment Classifier is a sentiment text classification model derived from the IndoBERT Base model. It is fine-tuned on the Prosa sentiment dataset to classify text as positive, neutral, or negative.
Architecture
The model architecture is based on the BERT (Bidirectional Encoder Representations from Transformers) framework, specifically the IndoBERT Base Model. The model is implemented in PyTorch and used through the Hugging Face Transformers library.
Training
The Indonesian BERT Base Sentiment Classifier was trained on the Prosa sentiment dataset, which is part of the IndoNLU benchmark. It uses the pre-trained IndoBERT Base Model (phase1 - uncased) as its starting point, fine-tuning it for sentiment analysis tasks.
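The fine-tuning described above could look roughly like the following sketch. The hyperparameters, the tokenization settings, and the `fine_tune` helper are illustrative assumptions, not the author's actual training recipe; the base checkpoint name `indobenchmark/indobert-base-p1` corresponds to IndoBERT Base (phase 1, uncased), and the datasets are assumed to be Hugging Face `datasets` objects with `text` and `label` columns.

```python
# Hedged sketch: fine-tuning IndoBERT Base (phase 1, uncased) for 3-way
# sentiment classification. Hyperparameters are illustrative assumptions.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)


def fine_tune(train_dataset, eval_dataset, output_dir="indobert-sentiment"):
    """Fine-tune IndoBERT on a 3-label sentiment dataset (sketch)."""
    base = "indobenchmark/indobert-base-p1"  # IndoBERT Base, phase 1 (uncased)
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=3)

    # Tokenize the raw text column; max_length=128 is an assumption.
    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True,
                         padding="max_length", max_length=128)

    train_dataset = train_dataset.map(tokenize, batched=True)
    eval_dataset = eval_dataset.map(tokenize, batched=True)

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,              # illustrative
        per_device_train_batch_size=16,  # illustrative
        learning_rate=2e-5,              # common BERT fine-tuning default
    )
    trainer = Trainer(model=model, args=args,
                      train_dataset=train_dataset, eval_dataset=eval_dataset)
    trainer.train()
    trainer.save_model(output_dir)
```

Running `fine_tune` downloads the base checkpoint and trains for several epochs, so it is best done on a GPU machine.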
Guide: Running Locally
To use the model for sentiment analysis, follow these steps:
- Install the `transformers` library:

  ```shell
  pip install transformers
  ```
- Load the model and tokenizer:

  ```python
  from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

  pretrained = "mdhugol/indonesia-bert-sentiment-classification"
  model = AutoModelForSequenceClassification.from_pretrained(pretrained)
  tokenizer = AutoTokenizer.from_pretrained(pretrained)
  ```
- Create a sentiment analysis pipeline:

  ```python
  sentiment_analysis = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
  ```
- Classify text sentiments:

  ```python
  label_index = {'LABEL_0': 'positive', 'LABEL_1': 'neutral', 'LABEL_2': 'negative'}

  pos_text = "Sangat bahagia hari ini"           # "Very happy today"
  neg_text = "Dasar anak sialan!! Kurang ajar!!"  # "Damn brat!! So rude!!"

  result = sentiment_analysis(pos_text)
  status = label_index[result[0]['label']]
  score = result[0]['score']
  print(f'Text: {pos_text} | Label : {status} ({score * 100:.3f}%)')

  result = sentiment_analysis(neg_text)
  status = label_index[result[0]['label']]
  score = result[0]['score']
  print(f'Text: {neg_text} | Label : {status} ({score * 100:.3f}%)')
  ```
For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure to handle the computational load.
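When a GPU is available, the pipeline can be pointed at it via its `device` argument (in the Transformers pipeline API, `-1` means CPU and `0` the first CUDA device). A minimal sketch, with `pick_device` as a hypothetical helper:

```python
import torch


def pick_device() -> int:
    """Return 0 (first CUDA GPU) if one is available, else -1 (CPU)."""
    return 0 if torch.cuda.is_available() else -1


# Usage, assuming model and tokenizer are loaded as in the guide above:
# sentiment_analysis = pipeline("sentiment-analysis", model=model,
#                               tokenizer=tokenizer, device=pick_device())
```

On CPU-only machines this simply falls back to `device=-1`, so the same code runs locally and on a cloud GPU instance.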
License
The model is available for use under the terms specified on its Hugging Face model card or repository. Ensure compliance with any licensing requirements before using the model in production.