Erlangshen-MegatronBert-1.3B-Sentiment
Introduction
Erlangshen-MegatronBert-1.3B-Sentiment is a version of IDEA-CCNL's Chinese MegatronBERT model fine-tuned for sentiment analysis. The Erlangshen model achieved top performance on the FewCLUE and ZeroCLUE benchmarks in 2021.
Architecture
The model is based on Erlangshen-MegatronBert-1.3B and was fine-tuned on 227,347 samples drawn from eight Chinese sentiment analysis datasets. Architecturally it follows MegatronBERT, a member of the BERT family, and targets natural language understanding (NLU) for sentiment analysis.
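The hyperparameters behind the 1.3B figure can be checked directly from the checkpoint's configuration without downloading the full weights. This is a minimal sketch using the standard Transformers AutoConfig API; the attribute names are the usual BERT-family config fields:

from transformers import AutoConfig

# Fetch only the configuration file, not the 1.3B-parameter weights
config = AutoConfig.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
print(config.model_type)         # architecture family
print(config.num_hidden_layers)  # transformer depth
print(config.hidden_size)        # hidden width
print(config.num_labels)         # number of sentiment classes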
Training
The model was fine-tuned on a collection of datasets tailored to Chinese sentiment analysis and reports the following scores on three benchmarks (a re-evaluation sketch follows the list):
- ASAP-SENT: 98.1
- ASAP-ASPECT: 97.8
- ChnSentiCorp: 97.0
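Assuming the reported numbers are accuracy percentages, scores of this kind can be reproduced with a straightforward loop over labeled examples. The sketch below is illustrative only: the two samples are hypothetical, and the label convention (0 = negative, 1 = positive) is an assumption rather than something stated in the model card:

import torch
from transformers import AutoModelForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
model = AutoModelForSequenceClassification.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
model.eval()

# Hypothetical (text, label) pairs; the 0=negative / 1=positive mapping is an assumption
samples = [('今天心情不好', 0), ('这家餐厅的菜非常好吃', 1)]

correct = 0
with torch.no_grad():
    for text, label in samples:
        inputs = tokenizer(text, return_tensors='pt')
        pred = model(**inputs).logits.argmax(dim=-1).item()
        correct += int(pred == label)
print(f'accuracy: {correct / len(samples):.3f}')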
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies: Ensure Python and PyTorch are installed, then install the Transformers library with pip:
pip install transformers
- Load the Model and Tokenizer:
from transformers import AutoModelForSequenceClassification, BertTokenizer
import torch

# Both pieces are downloaded from the Hugging Face Hub on first use
tokenizer = BertTokenizer.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
model = AutoModelForSequenceClassification.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
- Inference (a fuller variant is sketched after this list):
text = '今天心情不好'  # "I'm in a bad mood today"
# Encode, classify, and turn the logits into a probability distribution over the sentiment classes
output = model(torch.tensor([tokenizer.encode(text)]))
print(torch.nn.functional.softmax(output.logits, dim=-1))
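For practical use, a slightly fuller variant adds gradient-free evaluation, GPU placement, and a human-readable label. This is a sketch under two assumptions: a CUDA GPU is available, and the checkpoint's config carries an id2label mapping (if it does not, Transformers falls back to generic names such as LABEL_0):

import torch

model = model.eval().cuda()  # assumption: a CUDA GPU is available

text = '今天心情不好'
inputs = tokenizer(text, return_tensors='pt').to('cuda')
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.nn.functional.softmax(logits, dim=-1)
pred = probs.argmax(dim=-1).item()
# id2label may be generic (LABEL_0/LABEL_1) if the checkpoint does not name its classes
print(model.config.id2label[pred], probs[0, pred].item())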
Suggested Cloud GPUs
For running a 1.3B-parameter model like Erlangshen-MegatronBert-1.3B efficiently, consider cloud services such as AWS, Google Cloud, or Azure, which offer GPU instances tailored for deep learning. At roughly 4 bytes per parameter, the weights alone occupy about 5.2 GB in fp32 (about half that in fp16), so a single 16 GB GPU is comfortably sufficient for inference.
License
This model is released under the Apache 2.0 license, allowing for both personal and commercial use with minimal restrictions.