PKO-T5-LARGE
Introduction
PKO-T5-LARGE is a Korean language model based on T5 v1.1 and intended for text-to-text generation tasks. It is pretrained with the unsupervised span-corruption objective and uses a BBPE (byte-level BPE) tokenizer for Korean text, which avoids out-of-vocabulary issues.
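As a rough illustration of the span-corruption objective (the sentence, the masked spans, and the sentinel placement below are made up for this example, not taken from the model's training data), pretraining turns masked spans into a text-to-text pair using sentinel tokens such as <extra_id_0>:

  # Illustrative span-corruption pair: sentinel tokens mark the masked spans.
  original  = "한국어 텍스트를 위한 T5 모델입니다."
  corrupted = "한국어 <extra_id_0> 위한 <extra_id_1> 모델입니다."   # encoder input
  targets   = "<extra_id_0> 텍스트를 <extra_id_1> T5 <extra_id_2>"  # decoder target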
Architecture
The model is a variant of the T5 architecture designed specifically for Korean. It replaces the standard SentencePiece tokenizer with a BBPE tokenizer to handle Korean text more effectively, and it is trained on Korean corpora such as Namuwiki and Wikipedia, among others.
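As a rough illustration of how the byte-level tokenizer sidesteps out-of-vocabulary tokens, the sketch below (assuming the transformers library and access to the paust/pko-t5-large checkpoint; the Korean sentence is just an example) tokenizes arbitrary Korean text into known subword pieces:

  from transformers import T5TokenizerFast

  tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')

  # Arbitrary Korean text maps to byte-level subword pieces rather than an unknown token.
  text = "한국어 전용 바이트 수준 토크나이저 예시입니다."
  print(tokenizer.tokenize(text))  # subword pieces
  print(tokenizer.encode(text))    # corresponding token ids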
Training
PKO-T5-LARGE was pretrained with unsupervised learning only, focusing on the span-corruption task, so fine-tuning on the target task is recommended before using the model. Pretraining on Korean corpora with a Korean-specific tokenizer is what lets the model perform well on downstream Korean tasks once fine-tuned.
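A minimal fine-tuning sketch, for illustration only: a single optimizer step on one made-up question-answer pair. The learning rate, prompt prefix, and example text are assumptions; a real run would use a full dataset, batching, padding with ignored label positions, and evaluation.

  import torch
  from transformers import T5TokenizerFast, T5ForConditionalGeneration

  tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')
  model = T5ForConditionalGeneration.from_pretrained('paust/pko-t5-large')
  optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # illustrative learning rate

  # One toy text-to-text training pair in the "qa question:" prompt style.
  inputs = tokenizer(["qa question: 대한민국의 수도는 어디인가요?"], return_tensors="pt")
  labels = tokenizer(["서울입니다."], return_tensors="pt").input_ids

  model.train()
  loss = model(**inputs, labels=labels).loss  # cross-entropy over the target tokens
  loss.backward()
  optimizer.step()
  optimizer.zero_grad()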
Guide: Running Locally
To run PKO-T5-LARGE locally:
- Install Required Packages:

  pip install transformers
- Load the Model and Tokenizer:

  from transformers import T5TokenizerFast, T5ForConditionalGeneration

  tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')
  model = T5ForConditionalGeneration.from_pretrained('paust/pko-t5-large')
- Run Inference (a text-generation sketch follows this list):

  input_ids = tokenizer(["qa question: 당신의 이름은 무엇인가요?"], return_tensors="pt").input_ids
  labels = tokenizer(["T5 입니다."], return_tensors="pt").input_ids
  outputs = model(input_ids=input_ids, labels=labels)
  print(f"loss={outputs.loss} logits={outputs.logits}")
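The forward pass above returns the loss and logits used for training and evaluation. To decode actual text, a small generation sketch reusing the tokenizer and model loaded in the previous step (max_new_tokens is an illustrative choice, and without task-specific fine-tuning the output may not be meaningful):

  gen_input_ids = tokenizer(["qa question: 당신의 이름은 무엇인가요?"], return_tensors="pt").input_ids
  gen_ids = model.generate(gen_input_ids, max_new_tokens=32)  # greedy decoding by default
  print(tokenizer.decode(gen_ids[0], skip_special_tokens=True))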
For optimal performance, consider using cloud GPU services like AWS EC2, Google Cloud Platform, or Azure.
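If a GPU is available, locally or on such a cloud instance, the model and tensors can be moved onto it before the forward pass; a minimal sketch, assuming PyTorch with CUDA support and the variables defined in the steps above:

  import torch

  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  model.to(device)
  outputs = model(input_ids=input_ids.to(device), labels=labels.to(device))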
License
PKO-T5-LARGE is released under the MIT license, allowing for wide-ranging use and modification. More details can be found in the license file.