PKO-T5-LARGE
Introduction
PKO-T5-LARGE is a Korean language model based on T5 v1.1 and intended for text-to-text generation tasks. It is pretrained with the unsupervised span-corruption objective and uses a BBPE (byte-level BPE) tokenizer for Korean text, which avoids out-of-vocabulary issues.
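As a rough illustration of the span-corruption objective (the sentence, the masked spans, and the sentinel placement below are made up for this example, not taken from the model's training data), pretraining turns masked spans into a text-to-text pair using sentinel tokens such as <extra_id_0>:

  # Illustrative span-corruption pair: sentinel tokens mark the masked spans.
  original  = "한국어 텍스트를 위한 T5 모델입니다."
  corrupted = "한국어 <extra_id_0> 위한 <extra_id_1> 모델입니다."   # encoder input
  targets   = "<extra_id_0> 텍스트를 <extra_id_1> T5 <extra_id_2>"  # decoder target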
Architecture
The model is a variant of the T5 architecture designed specifically for Korean. It replaces the standard SentencePiece tokenizer with a BBPE tokenizer to handle Korean text more effectively, and it is trained on Korean corpora such as Namuwiki and Wikipedia, among others.
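As a rough illustration of how the byte-level tokenizer sidesteps out-of-vocabulary tokens, the sketch below (assuming the transformers library and access to the paust/pko-t5-large checkpoint; the Korean sentence is just an example) tokenizes arbitrary Korean text into known subword pieces:

  from transformers import T5TokenizerFast

  tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')

  # Arbitrary Korean text maps to byte-level subword pieces rather than an unknown token.
  text = "한국어 전용 바이트 수준 토크나이저 예시입니다."
  print(tokenizer.tokenize(text))  # subword pieces
  print(tokenizer.encode(text))    # corresponding token ids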
Training
PKO-T5-LARGE was pretrained with unsupervised learning only, focusing on the span-corruption task, so fine-tuning on the target task is recommended before using the model. Pretraining on Korean corpora with a Korean-specific tokenizer is what lets the model perform well on downstream Korean tasks once fine-tuned.
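A minimal fine-tuning sketch, for illustration only: a single optimizer step on one made-up question-answer pair. The learning rate, prompt prefix, and example text are assumptions; a real run would use a full dataset, batching, padding with ignored label positions, and evaluation.

  import torch
  from transformers import T5TokenizerFast, T5ForConditionalGeneration

  tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')
  model = T5ForConditionalGeneration.from_pretrained('paust/pko-t5-large')
  optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # illustrative learning rate

  # One toy text-to-text training pair in the "qa question:" prompt style.
  inputs = tokenizer(["qa question: 대한민국의 수도는 어디인가요?"], return_tensors="pt")
  labels = tokenizer(["서울입니다."], return_tensors="pt").input_ids

  model.train()
  loss = model(**inputs, labels=labels).loss  # cross-entropy over the target tokens
  loss.backward()
  optimizer.step()
  optimizer.zero_grad()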
Guide: Running Locally
To run PKO-T5-LARGE locally:
- Install Required Packages:

  pip install transformers
- Load the Model and Tokenizer:

  from transformers import T5TokenizerFast, T5ForConditionalGeneration

  tokenizer = T5TokenizerFast.from_pretrained('paust/pko-t5-large')
  model = T5ForConditionalGeneration.from_pretrained('paust/pko-t5-large')
- Run Inference (a text-generation sketch follows this list):

  input_ids = tokenizer(["qa question: 당신의 이름은 무엇인가요?"], return_tensors="pt").input_ids
  labels = tokenizer(["T5 입니다."], return_tensors="pt").input_ids
  outputs = model(input_ids=input_ids, labels=labels)
  print(f"loss={outputs.loss} logits={outputs.logits}")
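The forward pass above returns the loss and logits used for training and evaluation. To decode actual text, a small generation sketch reusing the tokenizer and model loaded in the previous step (max_new_tokens is an illustrative choice, and without task-specific fine-tuning the output may not be meaningful):

  gen_input_ids = tokenizer(["qa question: 당신의 이름은 무엇인가요?"], return_tensors="pt").input_ids
  gen_ids = model.generate(gen_input_ids, max_new_tokens=32)  # greedy decoding by default
  print(tokenizer.decode(gen_ids[0], skip_special_tokens=True))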
For optimal performance, consider using cloud GPU services like AWS EC2, Google Cloud Platform, or Azure.
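If a GPU is available, locally or on such a cloud instance, the model and tensors can be moved onto it before the forward pass; a minimal sketch, assuming PyTorch with CUDA support and the variables defined in the steps above:

  import torch

  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  model.to(device)
  outputs = model(input_ids=input_ids.to(device), labels=labels.to(device))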
License
PKO-T5-LARGE is released under the MIT license, allowing for wide-ranging use and modification. More details can be found in the license file.