Randeng-T5-77M-MultiTask-Chinese
IDEA-CCNL

Introduction
The Randeng-T5-77M-MultiTask-Chinese model is a fine-tuned version of the Randeng-T5-77M model, designed for multi-task learning in Chinese. It leverages over 100 datasets covering natural language processing tasks such as sentiment analysis, text classification, and keyword extraction.
Architecture
The model is built on the T5 architecture, a transformer-based encoder-decoder for text-to-text generation. Every task is cast into a single unified text-to-text format, which lets one model handle a wide range of NLP tasks.
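As an illustration of this format, each task is expressed as a Chinese prompt with a task prefix, and the answer is generated as plain text. The sentiment-analysis prompt below is the one used in the running-locally guide later in this document; the other two prompts are hypothetical placeholders that only show the general pattern, not the exact prefixes used during fine-tuning.

# Illustrative task prompts in the unified text-to-text format.
prompts = [
    # Sentiment analysis (from the guide below): "What is the sentiment of this
    # text? Positive/Negative"
    "情感分析任务:【房间还是比较舒适的,酒店服务良好】这篇文章的情感态度是什么?正面/负面",
    # News classification (hypothetical wording): "Which category does this news item belong to?"
    "新闻分类任务:【……】这条新闻属于哪个类别?",
    # Keyword extraction (hypothetical wording): "Extract the keywords from this passage."
    "关键词抽取任务:【……】请抽取这段文字的关键词。",
]
# For each prompt the model generates the answer (a label or keywords) as text,
# so a single encoder-decoder covers all tasks.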
Training
The model was fine-tuned on more than 100 Chinese datasets, producing a multi-task version that covers sentiment analysis, news classification, natural language inference, and more. Over 300,000 samples were used in this process, broadening the range of language processing tasks the model can handle.
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies (sentencepiece is required by T5Tokenizer):

pip install torch transformers sentencepiece
- Load the Model and Tokenizer:

import torch
from transformers import T5Tokenizer, T5Config, T5ForConditionalGeneration

pretrained_model = "IDEA-CCNL/Randeng-T5-77M-MultiTask-Chinese"

# Load the tokenizer, config, and model weights, then switch to inference mode.
tokenizer = T5Tokenizer.from_pretrained(pretrained_model)
config = T5Config.from_pretrained(pretrained_model)
model = T5ForConditionalGeneration.from_pretrained(pretrained_model, config=config)
model.eval()
- Prepare Input Text:

# "Sentiment analysis task: [The room was quite comfortable and the hotel service
# was good] What is the sentiment of this text? Positive/Negative"
text = "情感分析任务:【房间还是比较舒适的,酒店服务良好】这篇文章的情感态度是什么?正面/负面"

encode_dict = tokenizer(text, max_length=512, padding='max_length', truncation=True)
inputs = {
    "input_ids": torch.tensor([encode_dict['input_ids']]).long(),
    "attention_mask": torch.tensor([encode_dict['attention_mask']]).long(),
}
- Generate Output:

output_ids = model.generate(
    input_ids=inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    max_length=100,
    early_stopping=True,
)
predict_label = [tokenizer.decode(ids, skip_special_tokens=True) for ids in output_ids]
print(predict_label)
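For convenience, the tokenization and generation steps above can be wrapped in a small helper so that other task prompts can be tried with a single call. The sketch below only repackages the calls already shown and is not part of the official example.

def predict(text):
    # Tokenize the prompt exactly as in the steps above.
    encode_dict = tokenizer(text, max_length=512, padding='max_length', truncation=True)
    input_ids = torch.tensor([encode_dict['input_ids']]).long()
    attention_mask = torch.tensor([encode_dict['attention_mask']]).long()
    # Greedy generation, then decode the single returned sequence.
    output_ids = model.generate(input_ids=input_ids,
                                attention_mask=attention_mask,
                                max_length=100,
                                early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Same sentiment-analysis prompt as in the example above.
print(predict("情感分析任务:【房间还是比较舒适的,酒店服务良好】这篇文章的情感态度是什么?正面/负面"))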
For optimal performance, consider using cloud GPUs such as those available from AWS, Google Cloud, or Azure to handle the computational requirements of running the model.
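If a GPU is available, standard PyTorch device placement can be used; the sketch below is a generic pattern applied to the objects defined above, not something specific to this model.

# Move the model and the prepared inputs onto a CUDA device when available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

inputs = {k: v.to(device) for k, v in inputs.items()}
output_ids = model.generate(input_ids=inputs['input_ids'],
                            attention_mask=inputs['attention_mask'],
                            max_length=100,
                            early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))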
License
The Randeng-T5-77M-MultiTask-Chinese model is licensed under the Apache 2.0 License, allowing for open use and modification.