roberta base finetuned jd full chinese
uer
Introduction
This project provides a set of five Chinese RoBERTa-Base models fine-tuned for text classification. The models were fine-tuned using UER-py and TencentPretrain; the latter supports models with over one billion parameters and extends to a multimodal pre-training framework.
Architecture
The models are based on the RoBERTa architecture, pre-trained specifically for the Chinese language. Fine-tuning uses a sequence length of 512 and starts from the pre-trained chinese_roberta_L-12_H-768 model, whose name indicates 12 layers and a hidden size of 768.
Training
The models are fine-tuned on five Chinese text classification datasets: JD full, JD binary, Dianping, Ifeng, and Chinanews. These datasets consist of user reviews and news articles, labeled by sentiment and topic. Training runs for three epochs with a learning rate of 3e-5 and a batch size of 32. Fine-tuning was conducted on Tencent Cloud.
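To make the stated hyperparameters concrete, the following minimal sketch works out how many optimizer updates three epochs at batch size 32 would imply. The dataset size is a hypothetical placeholder, not a figure from the source, and the function assumes one update per batch with no gradient accumulation.

```python
import math


def total_training_steps(num_examples: int, batch_size: int = 32, epochs: int = 3) -> int:
    """Count optimizer updates, assuming one update per batch and no gradient accumulation."""
    # The last batch of an epoch may be smaller, so round up.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Hypothetical dataset size, chosen only to illustrate the arithmetic.
example_size = 10_000
print(total_training_steps(example_size))
```

The defaults mirror the batch size and epoch count given above; substitute the real size of whichever dataset you are working with.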
Guide: Running Locally
- Install Dependencies: Ensure you have Python and the transformers library installed.

    pip install transformers
- Load Model: Use the transformers library to load the model and run text classification.

    from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

    # Load the fine-tuned model and its tokenizer from the Hugging Face Hub
    model = AutoModelForSequenceClassification.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')
    tokenizer = AutoTokenizer.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')
    text_classification = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

    # Input sentence: "Beijing held the Two Sessions last month"
    result = text_classification("北京上个月召开了两会")
    print(result)
- Cloud GPUs: Consider using cloud services like AWS, GCP, or Azure for faster processing with GPU capabilities.
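The steps above can be wrapped in a small helper that switches between the five fine-tuned models and optionally places the pipeline on a GPU. This is a sketch under stated assumptions: the helper names (`model_id`, `load_classifier`) and the dataset slugs are illustrative, and the repository naming pattern is inferred from the two model names that appear in this document, so confirm each identifier on the Hugging Face Hub before relying on it.

```python
# Dataset slugs assumed from the five datasets named in the Training section.
DATASET_SLUGS = ("jd-full", "jd-binary", "dianping", "ifeng", "chinanews")


def model_id(dataset: str) -> str:
    """Build a Hub repository id, assuming the 'uer/roberta-base-finetuned-<dataset>-chinese' pattern."""
    if dataset not in DATASET_SLUGS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {DATASET_SLUGS}")
    return f"uer/roberta-base-finetuned-{dataset}-chinese"


def load_classifier(dataset: str, use_gpu: bool = False):
    """Create a text-classification pipeline for the chosen dataset."""
    # Imported lazily so model_id() can be used without transformers installed.
    from transformers import pipeline

    # device=0 selects the first CUDA GPU; device=-1 runs on the CPU.
    device = 0 if use_gpu else -1
    return pipeline("text-classification", model=model_id(dataset), device=device)


# Example usage (downloads the model on first call):
# classifier = load_classifier("chinanews", use_gpu=False)
# print(classifier("北京上个月召开了两会"))
```

Passing `use_gpu=True` only makes sense on a machine with a CUDA-capable GPU and a GPU-enabled backend installed, such as one of the cloud instances mentioned above.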
License
No license information is given in the source documentation. Check the Hugging Face model page or the associated GitHub repository for licensing details.