pko-t5-large

paust

Introduction

PKO-T5-LARGE is a Korean language model based on T5 v1.1, intended for text-to-text generation tasks. It is pretrained with an unsupervised span-corruption objective and uses a BBPE (byte-level BPE) tokenizer for Korean text, which avoids out-of-vocabulary issues.
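For intuition, the sketch below shows what a T5-style span-corruption pair looks like; the Korean sentence and the masked spans are illustrative examples, not drawn from the actual pretraining data.

    # Minimal illustration of T5-style span corruption (the example sentence is made up).
    original = "한국어 텍스트를 생성하는 언어 모델입니다."

    # Randomly selected spans are replaced with sentinel tokens in the encoder input...
    corrupted_input = "한국어 <extra_id_0> 생성하는 <extra_id_1> 모델입니다."
    # ...and the decoder target reconstructs only the masked-out spans.
    target = "<extra_id_0> 텍스트를 <extra_id_1> 언어"

    print(corrupted_input)
    print(target)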

Architecture

The model follows the T5 v1.1 architecture, adapted for Korean. The standard SentencePiece tokenizer is replaced with a BBPE tokenizer so that Korean text is handled at the byte level, and the model is pretrained on Korean corpora such as Namuwiki and Korean Wikipedia.
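As a quick check of the byte-level behaviour, you can tokenize arbitrary Korean text and confirm that nothing maps to the unknown token; the sentence below is illustrative, and the snippet assumes the checkpoint can be downloaded from the Hub.

    from transformers import T5TokenizerFast

    tokenizer = T5TokenizerFast.from_pretrained("paust/pko-t5-large")

    text = "희귀한 한국어 단어도 미등록 어휘 없이 처리됩니다."
    encoding = tokenizer(text)

    # BBPE falls back to raw bytes, so no position should be the unknown token.
    print(tokenizer.convert_ids_to_tokens(encoding.input_ids))
    print(tokenizer.unk_token_id in encoding.input_ids)  # expected: False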

Training

PKO-T5-LARGE was pretrained with unsupervised learning on the span-corruption objective over Korean text. Because the released checkpoint is a pretrained model rather than a task-specific one, fine-tuning on the target task is recommended before use.
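A minimal fine-tuning sketch follows; the toy QA pair, the "qa question:" prefix, and the hyperparameters are illustrative assumptions, not the authors' training recipe.

    import torch
    from transformers import T5TokenizerFast, T5ForConditionalGeneration

    tokenizer = T5TokenizerFast.from_pretrained("paust/pko-t5-large")
    model = T5ForConditionalGeneration.from_pretrained("paust/pko-t5-large")
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # Toy supervised pair; replace with a real Korean downstream dataset.
    pairs = [("qa question: 대한민국의 수도는 어디인가요?", "서울입니다.")]

    model.train()
    for source, target in pairs:
        batch = tokenizer([source], return_tensors="pt")
        labels = tokenizer([target], return_tensors="pt").input_ids
        loss = model(input_ids=batch.input_ids,
                     attention_mask=batch.attention_mask,
                     labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()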

Guide: Running Locally

To run PKO-T5-LARGE locally:

  1. Install Required Packages:

    pip install transformers
    
  2. Load the Model and Tokenizer:

    from transformers import T5TokenizerFast, T5ForConditionalGeneration
    
    tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')
    model = T5ForConditionalGeneration.from_pretrained('paust/pko-t5-large')
    
  3. Run Inference (a generation sketch follows these steps):

    # Tokenize the prompt and the target, returning PyTorch tensors.
    input_ids = tokenizer(["qa question: 당신의 이름은 무엇인가요?"], return_tensors="pt").input_ids
    labels = tokenizer(["T5 입니다."], return_tensors="pt").input_ids
    # A forward pass with labels returns the loss and the token logits.
    outputs = model(input_ids=input_ids, labels=labels)
    print(f"loss={outputs.loss} logits={outputs.logits}")
    

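Once the model is loaded, you can also decode free-form output with generate(); the prompt reuses the example above, and the decoding settings are illustrative.

    # Generate an answer instead of computing the loss.
    input_ids = tokenizer(["qa question: 당신의 이름은 무엇인가요?"], return_tensors="pt").input_ids
    generated = model.generate(input_ids, max_new_tokens=32)
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))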
For optimal performance, consider using cloud GPU services like AWS EC2, Google Cloud Platform, or Azure.

License

PKO-T5-LARGE is released under the MIT license, allowing for wide-ranging use and modification. More details can be found in the license file.
