EAGLE-3B Preview
Introduction
EAGLE-3B is a decoder-only, causal language model developed by ETRI's Language Intelligence Research Section. It is designed for STEM fields, with a focus on tasks such as mathematics and quantitative reasoning. It is a pre-trained foundation model without instruction tuning, so it requires fine-tuning for specific applications such as chatbot interaction.
Architecture
The model has 3.1 billion parameters and uses a LLaMA-compatible architecture. Its tokenizer is similar to LLaMA's byte-fallback BPE, with digit separation. The model was pre-trained from scratch on 720 billion tokens using 8× A100 80GB PCIe GPUs.
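As an illustrative check of the digit-separation behavior, the sketch below tokenizes a short numeric string. It assumes you have access to the repository used in the loading guide further down; the exact token split shown in the comment is an expectation, not an official specification.

```python
from transformers import AutoTokenizer

# Hypothetical check: load the tokenizer and inspect how digits are split.
# Repository id taken from the loading example later in this card.
tokenizer = AutoTokenizer.from_pretrained("etri-lirs/egpt-3b-preview")

# With digit separation, a number such as "12345" is expected to be split
# into individual digit tokens rather than a single multi-digit token.
print(tokenizer.tokenize("The answer is 12345."))
```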
Training
EAGLE-3B was trained on a variety of datasets, including AIHub, KISTI, and Korean Wikipedia, among others. The training data and methodology are updated periodically, so details are subject to change.
Guide: Running Locally
To run the model locally, use the following steps:
- Ensure Environment Setup: Install `transformers` version 4.28 or higher.
- Load Model:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

def load_model(mdl_path):
    # Load the tokenizer and model weights from the given path or hub id.
    tokenizer = AutoTokenizer.from_pretrained(mdl_path)
    # device_map="auto" places the weights on available devices automatically;
    # torch_dtype="auto" keeps the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(
        mdl_path, device_map="auto", torch_dtype="auto"
    )
    return tokenizer, model

tokenizer, model = load_model("etri-lirs/egpt-3b-preview")
```
- Run Inference: Modify the generation configuration as needed and pass your input to the model via standard input; a minimal sketch follows this list.
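A minimal inference sketch, assuming the `tokenizer` and `model` returned by `load_model` above. The prompt is read from standard input as described in the step above; the generation parameters are placeholders, not settings recommended by this card.

```python
import sys
import torch

# Read a prompt from standard input.
prompt = sys.stdin.readline().strip()

# Tokenize and move inputs to the device the model was loaded onto.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Placeholder generation settings; tune max_new_tokens, temperature, etc.
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```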
Consider using cloud GPU services like AWS EC2, Google Cloud, or Azure for efficient processing.
License
The model is available for research and educational purposes only. Users must agree to the collection and use of personal information to access the model. The developers reserve the right to limit or revoke access if legal or social issues arise.