Randeng-T5-77M-MultiTask-Chinese

IDEA-CCNL

Introduction

The Randeng-T5-77M-MultiTask-Chinese model is a fine-tuned version of Randeng-T5-77M, designed for multi-task learning in Chinese. It draws on over 100 Chinese datasets covering natural language processing tasks such as sentiment analysis, text classification, and keyword extraction.

Architecture

The model is built on the T5 architecture, a transformer-based encoder-decoder for text-to-text generation. Every task is cast into a unified text-to-text format (a natural-language prompt in, a textual answer out), which lets a single model handle a wide range of NLP tasks.
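
To make the text-to-text format concrete, each task is phrased as an instruction followed by the input, and the model answers in plain text. The first prompt below is the sentiment-analysis example used later in this guide; the second is only a hypothetical phrasing of a news-classification prompt, not taken from the model's training data.

    # Tasks are expressed as "instruction + input" strings; the model replies with text.
    prompts = [
        # Sentiment analysis (taken from the usage example below).
        "情感分析任务:【房间还是比较舒适的,酒店服务良好】这篇文章的情感态度是什么?正面/负面",
        # News classification (hypothetical phrasing, for illustration only).
        "新闻分类任务:【今晚的比赛将决定联赛冠军归属】这篇文章属于哪个类别?体育/财经/科技",
    ]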

Training

The model was fine-tuned using more than 100 Chinese datasets, resulting in a multitasking version that includes tasks like sentiment analysis, news classification, natural language inference, and more. Over 300,000 samples were used in this process, enhancing the model's ability to handle various language processing tasks.

Guide: Running Locally

To run the model locally, follow these steps; a small helper that combines them into one call is sketched after the list.

  1. Install Dependencies:

    pip install torch transformers sentencepiece
    
  2. Load the Model and Tokenizer:

    import torch
    from transformers import T5Tokenizer, T5Config, T5ForConditionalGeneration
    
    pretrained_model = "IDEA-CCNL/Randeng-T5-77M-MultiTask-Chinese"
    
    # Download tokenizer, config, and weights from the Hugging Face Hub.
    tokenizer = T5Tokenizer.from_pretrained(pretrained_model)
    config = T5Config.from_pretrained(pretrained_model)
    model = T5ForConditionalGeneration.from_pretrained(pretrained_model, config=config)
    model.eval()  # inference mode: disables dropout
    
  3. Prepare Input Text:

    text = "情感分析任务:【房间还是比较舒适的,酒店服务良好】这篇文章的情感态度是什么?正面/负面"
    encode_dict = tokenizer(text, max_length=512, padding='max_length', truncation=True)
    inputs = {
      "input_ids": torch.tensor([encode_dict['input_ids']]).long(),
      "attention_mask": torch.tensor([encode_dict['attention_mask']]).long(),
    }
    
  4. Generate Output:

    # generate() returns token IDs (not logits); pass the attention mask so padding is ignored.
    output_ids = model.generate(input_ids=inputs['input_ids'],
                                attention_mask=inputs['attention_mask'],
                                max_length=100, early_stopping=True)
    predict_label = [tokenizer.decode(ids, skip_special_tokens=True) for ids in output_ids]
    print(predict_label)
    
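The steps above can be wrapped in a small convenience helper so a prompt is answered with a single call. This is only a sketch around the tokenizer and model objects loaded earlier; the function name and defaults are illustrative.

    def predict(text, max_source_length=512, max_target_length=100):
        # Tokenize the prompt as in step 3.
        encode_dict = tokenizer(text, max_length=max_source_length,
                                padding='max_length', truncation=True)
        input_ids = torch.tensor([encode_dict['input_ids']]).long()
        attention_mask = torch.tensor([encode_dict['attention_mask']]).long()
        # Generate and decode as in step 4.
        output_ids = model.generate(input_ids=input_ids,
                                    attention_mask=attention_mask,
                                    max_length=max_target_length,
                                    early_stopping=True)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)
    
    print(predict("情感分析任务:【房间还是比较舒适的,酒店服务良好】这篇文章的情感态度是什么?正面/负面"))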

For faster inference, consider running the model on a GPU, such as a cloud GPU from AWS, Google Cloud, or Azure.
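
If a CUDA-capable GPU is available, the model and the input tensors prepared in the guide above can be moved onto it before generation; a minimal sketch:

    # Move the model and the already-prepared inputs to the GPU (falls back to CPU).
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    output_ids = model.generate(input_ids=inputs['input_ids'],
                                attention_mask=inputs['attention_mask'],
                                max_length=100, early_stopping=True)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))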

License

The Randeng-T5-77M-MultiTask-Chinese model is licensed under the Apache 2.0 License, which permits open use and modification.
