KR-FinBert-SC

snunlp

Introduction

KR-FinBert-SC is a specialized BERT model designed for sentiment analysis in the Korean financial domain. It builds on KR-FinBert, which is further pre-trained on a financial corpus, and shows significant performance improvements through domain adaptation and fine-tuning on labeled sentiment data.

Architecture

KR-FinBert-SC is based on the BERT architecture and has been specifically adapted for the Korean financial domain. The model benefits from pre-training on a diverse set of texts including news articles, legal documents, and financial reports, allowing it to effectively handle sentiment classification tasks.

Training

The model's training data includes an expanded corpus from KR-BERT-MEDIUM, consisting of texts from Korean Wikipedia, news articles, legal texts, and a dataset of Korean comments. For transfer learning, additional data from economic news articles and analyst reports is included, bringing the corpus to a total of 13.22 GB. The model was trained for 5.5 million steps with a maximum sequence length of 512, a batch size of 32, and a learning rate of 5e-5, on an NVIDIA TITAN Xp GPU over 67.48 hours.
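
As an illustration only, the reported hyperparameters (sequence length 512, batch size 32, learning rate 5e-5) could map onto a Hugging Face Trainer configuration like the sketch below; the base model id, label count, dataset, and epoch count are assumptions, not details taken from the model card.

```python
# Hypothetical fine-tuning sketch mirroring the reported hyperparameters.
# The base model id, label count, dataset, and epoch count are assumptions.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "snunlp/KR-FinBert"  # domain-adapted base later fine-tuned into KR-FinBert-SC
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=3)

training_args = TrainingArguments(
    output_dir="kr-finbert-sc",
    per_device_train_batch_size=32,   # batch size reported above
    learning_rate=5e-5,               # learning rate reported above
    num_train_epochs=3,               # assumption; the card reports steps, not epochs
)

# A labeled sentiment dataset would be tokenized with truncation to 512 tokens
# (the reported maximum sequence length) and passed to the Trainer:
# trainer = Trainer(model=model, args=training_args, train_dataset=tokenized_train)
# trainer.train()
```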

Guide: Running Locally

  1. Environment Setup: Install Python and necessary libraries, such as PyTorch and Transformers.
  2. Get the Model: Clone or download the model files from the Hugging Face model page, or let the Transformers library fetch them automatically by model id.
  3. Load Model: Use the Transformers library to load the KR-FinBert-SC model and tokenizer.
  4. Inference: Prepare your text data and run the model for sentiment analysis, as in the sketch after this list.
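
A minimal inference sketch is shown below, assuming the model is published on the Hugging Face Hub as snunlp/KR-FinBert-SC and that PyTorch and Transformers are installed; the example sentence and the output labels are illustrative.

```python
# Minimal sketch: sentiment analysis with KR-FinBert-SC via the Transformers pipeline.
from transformers import pipeline

classifier = pipeline("text-classification", model="snunlp/KR-FinBert-SC")

# Illustrative Korean financial headline.
print(classifier("삼성전자, 역대 최대 분기 영업이익 달성"))
# Output shape: [{'label': ..., 'score': ...}] -- actual labels depend on the model config
```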

For efficient performance, consider using cloud GPUs like AWS EC2 with NVIDIA GPUs or Google Cloud TPU.

License

The KR-FinBert-SC model is published under a license available on its GitHub repository. Please review the specific terms and conditions before use.
