Ko-GPT-Trinity 1.2B (v0.5)
Introduction
Ko-GPT-Trinity 1.2B (v0.5) is a language model developed by SK Telecom, based on the GPT-3 architecture. It has 1.2 billion parameters and is pre-trained primarily on Korean-language text, which makes it well suited to generating text from given prompts.
Architecture
Ko-GPT-Trinity 1.2B is a transformer-based language model with an autoregressive design, allowing it to predict the next token in a sequence. It is optimized for text generation tasks, leveraging its pre-trained understanding of the Korean language.
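As a rough illustration of this autoregressive design, the sketch below predicts one token at a time and feeds each prediction back into the model. It assumes the model is published on the Hugging Face Hub under the repository ID skt/ko-gpt-trinity-1.2B-v0.5 (not stated in this document) and uses the generic Transformers causal-LM API rather than any SK Telecom-specific tooling.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hub repository ID; adjust if the model is hosted elsewhere.
MODEL_ID = "skt/ko-gpt-trinity-1.2B-v0.5"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
model.eval()

# Greedy autoregressive decoding: each step predicts the next token from
# everything generated so far, then appends it to the input sequence.
input_ids = tokenizer("안녕하세요", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):
        logits = model(input_ids).logits           # (1, seq_len, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1)  # most likely next token
        input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))
```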
Training
The model was trained on Ko-DAT, a large-scale Korean dataset, processing 35 billion tokens over 72,000 steps using cross-entropy loss. This training setup enables the model to learn a rich representation of the Korean language, which is useful for various downstream tasks.
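The cross-entropy objective mentioned here is the standard causal language-modeling loss: the model predicts each token from the tokens preceding it, so the labels are the inputs shifted by one position. A minimal sketch of that computation, using toy shapes and random tensors as stand-ins for the real model and data:

```python
import torch
import torch.nn.functional as F

# Toy sizes for illustration only; the real model has its own vocabulary
# and a 1.2B-parameter transformer, not this random stand-in.
vocab_size, seq_len = 32000, 8
token_ids = torch.randint(0, vocab_size, (1, seq_len))
logits = torch.randn(1, seq_len, vocab_size)  # stand-in for model(token_ids).logits

# Shift so that position t predicts token t+1, then average the
# cross-entropy over all predicted positions.
shift_logits = logits[:, :-1, :].reshape(-1, vocab_size)
shift_labels = token_ids[:, 1:].reshape(-1)
loss = F.cross_entropy(shift_logits, shift_labels)
print(loss.item())
```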
Guide: Running Locally
- Environment Setup (see the setup sketch after this list):
  - Install PyTorch and Hugging Face's Transformers library.
  - Clone the model repository from the Hugging Face Hub.
- Model Download:
  - Use the Transformers library to download and load the Ko-GPT-Trinity 1.2B model.
- Inference (see the generation sketch after this list):
  - Prepare a text prompt in Korean.
  - Use the model to generate text based on the input prompt.
- Hardware Recommendations:
  - For optimal performance, especially on large-scale tasks, use cloud GPUs such as those available from AWS, Google Cloud, or Azure.
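A minimal sketch of the setup and download steps, assuming a Python environment and the Hub repository ID skt/ko-gpt-trinity-1.2B-v0.5 (the guide does not give the exact ID, so adjust it if needed):

```python
# Environment setup (run once in a shell):
#   pip install torch transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "skt/ko-gpt-trinity-1.2B-v0.5"  # assumed Hub repository ID

# Downloads the weights and tokenizer files from the Hugging Face Hub
# on first use and caches them locally.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
```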
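Continuing from the load step above, the following sketch covers inference and the hardware recommendation: the model is moved to a GPU when one is available, and text is generated from a Korean prompt with the generic Transformers generate API. The sampling settings are illustrative defaults, not values recommended by the model's authors.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()

prompt = "인공지능의 미래는"  # example Korean prompt: "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,   # illustrative sampling settings
        top_p=0.95,
        temperature=0.8,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```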
License
The model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). This license allows for sharing and adaptation for non-commercial purposes, provided proper attribution is given, and any derivative works follow the same licensing terms.