BERT-Base Turkish Sentiment Model

Introduction

The BERT-Base Turkish Sentiment Model is designed for sentiment analysis in the Turkish language. It is based on BERTurk and classifies the sentiment of texts such as movie and product reviews or tweets.

Architecture

The model is a fine-tuned version of the BERT architecture tailored to Turkish: a transformer encoder stack topped with a sequence-classification head that assigns a sentiment label to the input text.
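As a quick, low-cost check of this architecture description, the checkpoint's configuration can be inspected with standard transformers utilities. This is a minimal sketch; only the small config file is fetched, not the model weights.

    from transformers import AutoConfig

    # Download just the configuration and confirm the encoder type
    # and the size of the classification head.
    config = AutoConfig.from_pretrained("savasy/bert-base-turkish-sentiment-cased")
    print(config.model_type)  # expected: "bert"
    print(config.num_labels)  # expected: 2 sentiment classes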

Training

The model was trained on data from several sources, including movie and product reviews as well as tweets. Training consisted of fine-tuning a pre-trained BERT model (dbmdz/bert-base-turkish-uncased) on a dataset merged from previous studies.

  • Training Script:
    export GLUE_DIR="./sst-2-newall"
    export TASK_NAME=SST-2

    # Fine-tune BERTurk on the merged sentiment data, formatted as an SST-2-style GLUE task.
    python3 run_glue.py \
      --model_type bert \
      --model_name_or_path dbmdz/bert-base-turkish-uncased \
      --task_name "$TASK_NAME" \
      --do_train \
      --do_eval \
      --data_dir "$GLUE_DIR" \
      --max_seq_length 128 \
      --per_gpu_train_batch_size 32 \
      --learning_rate 2e-5 \
      --num_train_epochs 3.0 \
      --output_dir "./model"
    
  • Results: The model achieved an accuracy of approximately 95.4% on its evaluation split (a hedged sanity-check sketch follows this list).
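As a rough way to sanity-check a fine-tuned checkpoint like this, it can be scored against a handful of labeled sentences with the transformers pipeline. This is a minimal sketch, not the original evaluation: the two Turkish examples and their expected labels are assumptions for illustration (LABEL_1 as positive, LABEL_0 as negative), and the figure above comes from the merged test data.

    from transformers import pipeline

    sa = pipeline("sentiment-analysis", model="savasy/bert-base-turkish-sentiment-cased")

    # Hypothetical labeled examples; the original merged review/tweet
    # test set is not reproduced here.
    samples = [
        ("bu ürün harika", "LABEL_1"),        # "this product is great" -> positive
        ("berbat bir deneyimdi", "LABEL_0"),  # "it was a terrible experience" -> negative
    ]
    predictions = [sa(text)[0]["label"] for text, _ in samples]
    accuracy = sum(p == y for p, (_, y) in zip(predictions, samples)) / len(samples)
    print(f"accuracy: {accuracy:.2f}")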

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Required Libraries: Ensure the transformers library is installed, along with a deep-learning backend such as PyTorch.

    pip install transformers torch
    
  2. Load the Model and Tokenizer:

    from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
    
    model = AutoModelForSequenceClassification.from_pretrained("savasy/bert-base-turkish-sentiment-cased")
    tokenizer = AutoTokenizer.from_pretrained("savasy/bert-base-turkish-sentiment-cased")
    sa = pipeline("sentiment-analysis", tokenizer=tokenizer, model=model)
    
  3. Perform Sentiment Analysis: Use the loaded pipeline to analyze text sentiment (a batch-scoring variant is sketched after these steps).

    p = sa("bu telefon modelleri çok kaliteli , her parçası çok özel bence")
    # Input (Turkish): "these phone models are very high quality, I think every part is very special"
    print(p)  # Output: [{'label': 'LABEL_1', 'score': 0.9871089}]
    
  4. Cloud GPU Recommendation: For faster inference, consider cloud services such as AWS, Google Cloud, or Azure, which provide GPU instances well suited to transformer models; a sketch for moving the pipeline onto a GPU follows below.
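As mentioned in step 3, the pipeline also accepts a list of strings, which is convenient for scoring many reviews at once. A minimal sketch reusing the `sa` pipeline from step 2 (the second sentence is a made-up negative example meaning "the movie was very bad, I did not like it at all"):

    texts = [
        "bu telefon modelleri çok kaliteli , her parçası çok özel bence",
        "film çok kötüydü, hiç beğenmedim",  # hypothetical negative example
    ]
    # Passing a list returns one result dict per input text.
    for text, result in zip(texts, sa(texts)):
        print(f"{result['label']} ({result['score']:.4f}): {text}")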
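For step 4, once a GPU-backed machine is available, the pipeline can be placed on it with the standard device argument of transformers pipelines (0 selects the first CUDA device, -1 the CPU):

    import torch
    from transformers import pipeline

    # Use the first GPU if one is available, otherwise fall back to CPU.
    device = 0 if torch.cuda.is_available() else -1
    sa = pipeline(
        "sentiment-analysis",
        model="savasy/bert-base-turkish-sentiment-cased",
        device=device,
    )
    print(sa("bu telefon modelleri çok kaliteli , her parçası çok özel bence"))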

License

Please refer to the Hugging Face model page for licensing details and terms of use.
