EAGLE-3B Preview

etri-lirs

Introduction

EAGLE-3B is a decoder-only, causal language model developed by ETRI's Language Intelligence Research Section. It is designed for STEM domains, with a focus on mathematics and quantitative reasoning. EAGLE-3B is a pre-trained foundation model without instruction tuning and requires fine-tuning for downstream applications such as chatbot interaction.

Architecture

The model consists of 3.1 billion parameters and uses a LLaMA-compatible architecture. Its tokenizer is similar to LLaMA's byte-fallback BPE, with digit separation. The model was pre-trained from scratch on 720 billion tokens using eight A100 80GB PCIe GPUs.
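
The digit-separation behavior can be checked directly on the released tokenizer. The sketch below is illustrative only; it assumes the model ID used in the loading guide further down, and the exact token strings depend on the published vocabulary.

    from transformers import AutoTokenizer

    # Model ID taken from the loading guide below.
    tok = AutoTokenizer.from_pretrained("etri-lirs/egpt-3b-preview")

    # With LLaMA-style digit separation, each digit of "12345" is expected to
    # surface as its own token rather than as multi-digit merges.
    print(tok.tokenize("12345"))

    # Byte-fallback BPE encodes characters outside the base vocabulary as byte-level tokens.
    print(tok.tokenize("🦅"))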

Training

EAGLE-3B was trained on a mix of datasets, including AIHub, KISTI, and Korean Wikipedia. The training setup is subject to periodic updates as the data and methodology evolve.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Ensure Environment Setup: Install transformers version 4.28 or higher; the accelerate package is also required for device_map="auto" loading in the snippet below.
  2. Load Model:
    from transformers import AutoTokenizer, AutoModelForCausalLM

    def load_model(mdl_path):
        # Load the tokenizer and model weights from the Hugging Face Hub (or a local path).
        tokenizer = AutoTokenizer.from_pretrained(mdl_path)
        # device_map="auto" requires the accelerate package; torch_dtype="auto"
        # loads the checkpoint in its native precision.
        model = AutoModelForCausalLM.from_pretrained(mdl_path, device_map="auto", torch_dtype="auto")
        return tokenizer, model

    tokenizer, model = load_model("etri-lirs/egpt-3b-preview")
    
  3. Run Inference: Adjust the generation configuration as needed and pass your prompt to the model via standard input; see the sketch after this list.
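
The following is a minimal generation sketch, assuming the tokenizer and model from step 2 are already loaded; the prompt handling and the generation parameters (max_new_tokens, temperature, top_p) are illustrative choices, not settings recommended by the model authors.

    import sys
    import torch

    # Read a prompt from standard input, as the guide suggests.
    prompt = sys.stdin.readline().strip()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Illustrative sampling settings; tune these for your task.
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=128,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
        )

    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

For step-by-step quantitative-reasoning prompts, greedy decoding (do_sample=False) may be a more predictable starting point than sampling.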

Consider using cloud GPU services like AWS EC2, Google Cloud, or Azure for efficient processing.

License

The model is available for research and educational purposes only. Users must agree to the collection and use of personal information as a condition of access. The developers reserve the right to limit or revoke access if legal or social issues arise.
