Polyglot-Ko-1.3B
EleutherAI

Introduction
Polyglot-Ko-1.3B is a large-scale Korean autoregressive language model developed by EleutherAI's polyglot team. It is designed for text generation tasks using a transformer-based architecture and is implemented with the GPT-NeoX framework.
Architecture
Polyglot-Ko-1.3B consists of 24 transformer layers with a model dimension of 2048 and a feedforward dimension of 8192. It uses 16 attention heads, each with a head dimension of 128, and applies Rotary Position Embedding (RoPE) to 64 dimensions of each head. The tokenizer has a vocabulary of 30,003 tokens.
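As a quick sanity check, these hyperparameters can be read from the model's published configuration. The sketch below assumes the standard transformers AutoConfig API and GPT-NeoX attribute names; the expected values in the comments are the figures quoted above.

```python
from transformers import AutoConfig

# Fetch the published configuration for Polyglot-Ko-1.3B
config = AutoConfig.from_pretrained("EleutherAI/polyglot-ko-1.3b")

# Architectural hyperparameters described above (expected values in comments)
print(config.num_hidden_layers)    # transformer layers, expected 24
print(config.hidden_size)          # model dimension, expected 2048
print(config.intermediate_size)    # feedforward dimension, expected 8192
print(config.num_attention_heads)  # attention heads, expected 16
print(config.vocab_size)           # tokenizer vocabulary size
```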
Training
The model was trained on 863 GB of Korean data from various sources, including blogs, news, and more, following South Korean data laws. Sensitive information like bank accounts and phone numbers was masked during preprocessing. The training involved 213 billion tokens over 102,000 steps using 256 A100 GPUs.
Guide: Running Locally
To run Polyglot-Ko-1.3B locally, follow these steps:
- Install Required Libraries: Ensure you have `transformers` and `torch` installed.

```bash
pip install transformers torch
```
- Load the Model: Use the following Python code to load the model and tokenizer.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/polyglot-ko-1.3b")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/polyglot-ko-1.3b")
```
- Run Inference: Use the model to generate text as needed; a minimal generation sketch is shown below.
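The following sketch assumes the tokenizer and model loaded in the previous step; the Korean prompt and sampling parameters are illustrative choices, not part of the original guide.

```python
import torch

# Encode an example Korean prompt (illustrative, not from the original guide)
prompt = "안녕하세요. 오늘 날씨는"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a short continuation with sampling
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,   # length of the generated continuation
        do_sample=True,      # sample rather than decode greedily
        temperature=0.8,
        top_p=0.95,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```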
For enhanced performance, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure to handle the computational load effectively.
License
Polyglot-Ko-1.3B is licensed under the Apache License 2.0. You can freely use, modify, and distribute the model under the terms of this license, which can be found at http://www.apache.org/licenses/LICENSE-2.0.