roberta-base-finetuned-chinanews-chinese
Introduction
This document describes a set of Chinese RoBERTa-Base models fine-tuned for text classification. The models were fine-tuned with UER-py, cover five Chinese text classification datasets, and can alternatively be fine-tuned with TencentPretrain.
Architecture
The models are based on the RoBERTa architecture, tailored for Chinese text classification. They leverage a pre-trained model, chinese_roberta_L-12_H-768, with 12 layers and a hidden size of 768, optimized for handling Chinese language processing tasks.
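As a quick sanity check, the layer count and hidden size can be read directly from the published checkpoint's configuration. A minimal sketch, assuming the fine-tuned Chinanews model ID used in the guide below:

```python
# Minimal sanity check of the architecture described above.
from transformers import AutoConfig

config = AutoConfig.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')
print(config.num_hidden_layers, config.hidden_size)  # expected: 12 768
```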
Training
Training involves fine-tuning on five Chinese text classification datasets: JD full, JD binary, Dianping, Ifeng, and Chinanews. Training runs for three epochs with a sequence length of 512, using the UER-py toolkit on Tencent Cloud. Hyperparameters are kept consistent across the different models, and a checkpoint is saved at the end of any epoch in which the model achieves its best performance so far on the development set. The final models are converted to Hugging Face's format for ease of deployment.
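The authors' fine-tuning was done through UER-py's command-line tooling, which is not reproduced here. As an illustration only, here is a rough equivalent using the Hugging Face Trainer; the base-model ID uer/chinese_roberta_L-12_H-768, the 7-class label count, and the toy train/dev data are assumptions for the sketch, not the authors' actual setup.

```python
# Hypothetical reproduction sketch: the authors used UER-py, not the Trainer shown here.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

BASE = 'uer/chinese_roberta_L-12_H-768'  # assumed hub ID of the pre-trained base
tokenizer = AutoTokenizer.from_pretrained(BASE)
# Chinanews is assumed here to have 7 classes; adjust num_labels per dataset.
model = AutoModelForSequenceClassification.from_pretrained(BASE, num_labels=7)

def tokenize(batch):
    # Sequence length of 512, as described above.
    return tokenizer(batch['text'], truncation=True, max_length=512)

# Toy stand-ins for the real train/dev splits, just to make the sketch runnable.
train_ds = Dataset.from_dict({'text': ['北京上个月召开了两会'], 'label': [0]}).map(tokenize, batched=True)
dev_ds = Dataset.from_dict({'text': ['股市今日收涨'], 'label': [1]}).map(tokenize, batched=True)

args = TrainingArguments(output_dir='roberta-chinanews-sketch',
                         num_train_epochs=3,  # three epochs, per the description above
                         per_device_train_batch_size=8)
Trainer(model=model, args=args, train_dataset=train_ds,
        eval_dataset=dev_ds, tokenizer=tokenizer).train()
```

In the authors' setup, the checkpoint with the best development-set performance is the one kept and later converted to Hugging Face format; the sketch above omits that selection logic for brevity.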
Guide: Running Locally
- Install Dependencies: Ensure you have Python and the transformers library installed.
- Download Model: Use the Hugging Face model hub to download the required model.
- Load Model and Tokenizer:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')
tokenizer = AutoTokenizer.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')
text_classification = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
```
- Perform Inference:

```python
text_classification("北京上个月召开了两会")
```

This should yield classification results with labels and scores.
- Consider Cloud GPUs: For enhanced performance, especially with large datasets or batch processing, consider using cloud-based GPU resources such as AWS, Google Cloud, or Azure (see the sketch after this list).
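A minimal sketch of that setup: the same pipeline can be pointed at a GPU via the device argument and fed a batch of inputs. The second example sentence here is an arbitrary placeholder.

```python
import torch
from transformers import pipeline

# Select a GPU if one is available (device=0), otherwise fall back to CPU (-1).
device = 0 if torch.cuda.is_available() else -1
text_classification = pipeline('sentiment-analysis',
                               model='uer/roberta-base-finetuned-chinanews-chinese',
                               device=device)
# Batched inference amortizes per-call overhead across many inputs.
print(text_classification(["北京上个月召开了两会", "股市今日收涨"], batch_size=2))
```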
License
The model and its components are governed by the licenses provided by UER-py and TencentPretrain. Check their respective repositories for detailed licensing information.