Erlangshen-MegatronBert-1.3B-Sentiment
Introduction
Erlangshen-MegatronBert-1.3B-Sentiment is a version of IDEA-CCNL's Chinese MegatronBERT model fine-tuned for sentiment analysis. The Erlangshen model achieved top performance on the FewCLUE and ZeroCLUE benchmarks in 2021.
Architecture
The model is based on Erlangshen-MegatronBert-1.3B and was fine-tuned on 227,347 samples drawn from eight Chinese sentiment analysis datasets. Architecturally it follows MegatronBERT, a member of the BERT family, and targets natural language understanding (NLU) for sentiment analysis.
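The hyperparameters behind the 1.3B figure can be checked directly from the checkpoint's configuration without downloading the full weights. This is a minimal sketch using the standard Transformers AutoConfig API; the attribute names are the usual BERT-family config fields:

from transformers import AutoConfig

# Fetch only the configuration file, not the 1.3B-parameter weights
config = AutoConfig.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
print(config.model_type)         # architecture family
print(config.num_hidden_layers)  # transformer depth
print(config.hidden_size)        # hidden width
print(config.num_labels)         # number of sentiment classes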
Training
The model was fine-tuned on a collection of datasets tailored to Chinese sentiment analysis and reports the following scores on three benchmarks (a re-evaluation sketch follows the list):
- ASAP-SENT: 98.1
- ASAP-ASPECT: 97.8
- ChnSentiCorp: 97.0
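Assuming the reported numbers are accuracy percentages, scores of this kind can be reproduced with a straightforward loop over labeled examples. The sketch below is illustrative only: the two samples are hypothetical, and the label convention (0 = negative, 1 = positive) is an assumption rather than something stated in the model card:

import torch
from transformers import AutoModelForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
model = AutoModelForSequenceClassification.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
model.eval()

# Hypothetical (text, label) pairs; the 0=negative / 1=positive mapping is an assumption
samples = [('今天心情不好', 0), ('这家餐厅的菜非常好吃', 1)]

correct = 0
with torch.no_grad():
    for text, label in samples:
        inputs = tokenizer(text, return_tensors='pt')
        pred = model(**inputs).logits.argmax(dim=-1).item()
        correct += int(pred == label)
print(f'accuracy: {correct / len(samples):.3f}')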
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies: Ensure Python and PyTorch are installed, then install the Transformers library with pip:
pip install transformers
- Load the Model and Tokenizer:
from transformers import AutoModelForSequenceClassification, BertTokenizer
import torch

# Both pieces are downloaded from the Hugging Face Hub on first use
tokenizer = BertTokenizer.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
model = AutoModelForSequenceClassification.from_pretrained('IDEA-CCNL/Erlangshen-MegatronBert-1.3B-Sentiment')
- Inference (a fuller variant is sketched after this list):
text = '今天心情不好'  # "I'm in a bad mood today"
# Encode, classify, and turn the logits into a probability distribution over the sentiment classes
output = model(torch.tensor([tokenizer.encode(text)]))
print(torch.nn.functional.softmax(output.logits, dim=-1))
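For practical use, a slightly fuller variant adds gradient-free evaluation, GPU placement, and a human-readable label. This is a sketch under two assumptions: a CUDA GPU is available, and the checkpoint's config carries an id2label mapping (if it does not, Transformers falls back to generic names such as LABEL_0):

import torch

model = model.eval().cuda()  # assumption: a CUDA GPU is available

text = '今天心情不好'
inputs = tokenizer(text, return_tensors='pt').to('cuda')
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.nn.functional.softmax(logits, dim=-1)
pred = probs.argmax(dim=-1).item()
# id2label may be generic (LABEL_0/LABEL_1) if the checkpoint does not name its classes
print(model.config.id2label[pred], probs[0, pred].item())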
Suggested Cloud GPUs
For running a 1.3B-parameter model like Erlangshen-MegatronBert-1.3B efficiently, consider cloud services such as AWS, Google Cloud, or Azure, which offer GPU instances tailored for deep learning. At roughly 4 bytes per parameter, the weights alone occupy about 5.2 GB in fp32 (about half that in fp16), so a single 16 GB GPU is comfortably sufficient for inference.
License
This model is released under the Apache 2.0 license, allowing for both personal and commercial use with minimal restrictions.