roberta base finetuned dianping chinese
uer
Introduction
The Chinese RoBERTa-Base models for text classification were fine-tuned with UER-py, a toolkit for pre-training models. Each model targets a Chinese text classification task and is fine-tuned on one of several datasets, such as JD binary and Dianping. The models can be downloaded and used through either UER-py or Hugging Face.
Architecture
These models are based on the RoBERTa architecture, specifically the Chinese RoBERTa-Base model with 12 layers and a hidden size of 768. Each one is fine-tuned on a different dataset for Chinese text classification.
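To give a sense of scale for a 12-layer, 768-hidden encoder, the sketch below estimates the parameter count with plain arithmetic. Only the layer count and hidden size come from the description above. The feed-forward size (3072), vocabulary size (21128), maximum positions (512), and segment types (2) are assumptions taken from the usual BERT-base Chinese configuration, and the classification head is left out.

```python
# Rough parameter count for a BERT/RoBERTa-style encoder.
# From the model description: 12 layers, hidden size 768.
# ASSUMED (not stated in this document): FFN size 3072, vocab 21128,
# 512 positions, 2 segment types.

def encoder_layer_params(hidden: int, ffn: int) -> int:
    """Parameters in one Transformer encoder layer (weights + biases)."""
    attention = 4 * (hidden * hidden + hidden)  # Q, K, V, output projections
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    layer_norms = 2 * (2 * hidden)              # two LayerNorms: scale + bias
    return attention + feed_forward + layer_norms


def embedding_params(hidden: int, vocab: int, positions: int, segments: int) -> int:
    """Parameters in the token, position, and segment embeddings plus LayerNorm."""
    return (vocab + positions + segments) * hidden + 2 * hidden


def total_params(layers: int = 12, hidden: int = 768, ffn: int = 3072,
                 vocab: int = 21128, positions: int = 512, segments: int = 2) -> int:
    """Encoder + embeddings + pooler; excludes the classification head."""
    pooler = hidden * hidden + hidden
    return (layers * encoder_layer_params(hidden, ffn)
            + embedding_params(hidden, vocab, positions, segments)
            + pooler)


if __name__ == "__main__":
    print(f"per layer: {encoder_layer_params(768, 3072):,}")
    print(f"total (approx.): {total_params():,}")
```

Under these assumptions the estimate lands at roughly 100 million parameters, which is why a GPU is helpful for fine-tuning.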
Training
The models are fine-tuned with UER-py on Tencent Cloud for three epochs with a sequence length of 512, starting from the pre-trained chinese_roberta_L-12_H-768 model. At the end of each epoch, the model is saved if it achieves the best performance so far on the development set. The same hyper-parameters are used across all models.
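The save-on-best-dev-score rule described above can be sketched in a few lines. This is not UER-py's code: `train_one_epoch`, `evaluate`, and `save` are hypothetical callbacks standing in for the real training, evaluation, and checkpointing steps, and the dev scores in the demo are made up for illustration.

```python
from typing import Callable, Optional, Tuple


def train_with_best_checkpoint(
    num_epochs: int,
    train_one_epoch: Callable[[int], None],
    evaluate: Callable[[int], float],
    save: Callable[[int], None],
) -> Tuple[Optional[int], float]:
    """Train for num_epochs; save only when the dev score improves.

    Returns (best_epoch, best_score). All three callbacks are hypothetical
    placeholders for the actual toolkit's steps.
    """
    best_epoch: Optional[int] = None
    best_score = float("-inf")
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(epoch)
        score = evaluate(epoch)      # e.g. accuracy on the development set
        if score > best_score:       # strictly better than every earlier epoch
            best_epoch, best_score = epoch, score
            save(epoch)              # overwrite the stored best checkpoint
    return best_epoch, best_score


# Demo with made-up dev scores (illustrative only, not real results).
fake_dev_scores = {1: 0.90, 2: 0.93, 3: 0.92}
saved_epochs = []
best = train_with_best_checkpoint(
    num_epochs=3,
    train_one_epoch=lambda epoch: None,
    evaluate=lambda epoch: fake_dev_scores[epoch],
    save=saved_epochs.append,
)
print(best, saved_epochs)
```

With this rule the checkpoint that is kept need not come from the final epoch; an epoch that scores worse on the development set does not overwrite an earlier, better one.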
Guide: Running Locally
- Installation: Ensure you have the transformers library installed.
    pip install transformers
- Code Example: Use the following Python code to perform text classification with the model.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

    # Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
    model = AutoModelForSequenceClassification.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')
    tokenizer = AutoTokenizer.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')

    # 'sentiment-analysis' is the pipeline alias for text classification.
    text_classification = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

    result = text_classification("北京上个月召开了两会")
    print(result)
- Cloud GPUs: It is recommended to use cloud GPU services such as AWS, GCP, or Azure for efficient model training and inference.
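For the inference steps in the guide above, a minimal sketch of running on a GPU when one is present and falling back to the CPU otherwise. The helper names `pick_device_index` and `build_classifier` are hypothetical. The model identifier is the one used in the code example, and the 512-token limit mirrors the training sequence length; truncating at that length is an assumption about sensible usage, not a requirement stated by the authors.

```python
def pick_device_index() -> int:
    """Return 0 for the first CUDA GPU, or -1 for CPU.

    These integers follow the transformers pipeline `device` convention.
    Falls back to CPU if PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return -1
    return 0 if torch.cuda.is_available() else -1


def build_classifier(model_name: str = "uer/roberta-base-finetuned-chinanews-chinese"):
    """Create a text-classification pipeline on the selected device.

    Downloads the model on first use, so this needs network access.
    """
    from transformers import pipeline  # imported lazily; heavy dependency

    return pipeline("text-classification", model=model_name,
                    device=pick_device_index())


def classify(classifier, texts):
    """Classify a list of texts, truncating inputs to 512 tokens (assumed limit)."""
    return classifier(texts, truncation=True, max_length=512)
```

A typical call would be `classify(build_classifier(), ["北京上个月召开了两会"])`, which returns one label and score per input text.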
License
The models and the UER-py framework are open-source and are available under licenses that allow free use and distribution.