EXAONE-3.5-7.8B-Instruct-GGUF

QuantFactory

Introduction

EXAONE 3.5 is a series of instruction-tuned bilingual (English and Korean) generative models developed by LG AI Research. The series ranges from 2.4B to 32B parameters and includes a 7.8B model that improves on its predecessor. All models support long-context processing of up to 32K tokens, and LG AI Research reports state-of-the-art performance in real-world use cases.

Architecture

The EXAONE 3.5-7.8B model has 6.98 billion parameters (excluding embeddings) and 32 layers. It uses grouped-query attention (GQA) with 32 query heads and 8 key-value heads. The vocabulary size is 102,400, and the maximum context length is 32,768 tokens.
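
To make the effect of GQA concrete, the following sketch estimates the key-value cache size at the full 32,768-token context. The layer count, head counts, and context length come from the figures above. The per-head dimension of 128 and the 2-byte (16-bit) cache values are assumptions made for illustration and are not stated in this document, and kv_cache_bytes is a hypothetical helper name.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Estimate KV-cache size for one sequence.

    The leading factor of 2 accounts for storing both keys and values.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Figures taken from the architecture description.
NUM_LAYERS = 32
NUM_Q_HEADS = 32
NUM_KV_HEADS = 8
MAX_CONTEXT = 32768

# ASSUMPTION: per-head dimension is not given in this document.
HEAD_DIM = 128

# With GQA, only the 8 key-value heads are cached.
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, MAX_CONTEXT)

# For comparison: one key-value head per query head (no grouping).
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_Q_HEADS, HEAD_DIM, MAX_CONTEXT)

print(f"GQA cache: {gqa_bytes / 2**30:.1f} GiB")
print(f"Ungrouped cache: {mha_bytes / 2**30:.1f} GiB")
print(f"Reduction factor: {mha_bytes // gqa_bytes}x")
```

Under these assumptions the cache shrinks by the ratio of query heads to key-value heads (32 / 8 = 4), which is the main memory benefit of GQA at long context lengths.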

Training

The models were trained to use a system prompt and are optimized for both general and domain-specific use cases. Pre-quantized versions are also available in AWQ and other quantization types.

Guide: Running Locally

To run the EXAONE 3.5-7.8B model locally with the transformers library, use transformers v4.43 or later. The following snippet loads the original LGAI-EXAONE checkpoint and generates a short reply:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Explain how wonderful you are"  # English example
messages = [
    {"role": "system", 
     "content": "You are EXAONE model from LG AI Research, a helpful assistant."},
    {"role": "user", "content": prompt}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

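# Move the prompt to the device that device_map="auto" selected for the model.
output = model.generate(
    input_ids.to(model.device),
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=128,
    do_sample=False,  # greedy decoding
)
# The decoded text contains the prompt followed by the generated reply.
print(tokenizer.decode(output[0]))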
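
Before running the snippet above, it can help to confirm that the installed transformers release meets the v4.43 minimum. The following is a minimal sketch using only the Python standard library. The helper names (parse_major_minor, meets_minimum, installed_transformers_ok) are illustrative and are not part of transformers, and the parser assumes a version string that starts with numeric major and minor components.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = (4, 43)  # minimum version stated in this guide


def parse_major_minor(version_string):
    """Return (major, minor) from a string such as '4.43.1' or '4.44.0.dev0'."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version_string, minimum=MIN_TRANSFORMERS):
    """True if the given version is at least the required (major, minor)."""
    return parse_major_minor(version_string) >= minimum


def installed_transformers_ok():
    """Check the installed transformers package; False if it is not installed."""
    try:
        return meets_minimum(version("transformers"))
    except PackageNotFoundError:
        return False


print("transformers version OK:", installed_transformers_ok())
```

The check compares only the major and minor components, which is enough for a "v4.43 or later" requirement.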

For efficient execution, consider running the model on a GPU instance from a cloud provider such as AWS, Google Cloud, or Azure.
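
When choosing a GPU, a rough estimate of the memory needed for the weights alone can be useful. The sketch below assumes a total of about 7.8 billion parameters (taken from the model name) and nominal bit widths per weight. Real quantized formats store extra scaling data, so actual files are somewhat larger, and activations plus the KV cache need additional memory on top. The helper name weight_memory_gib is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Approximate memory for model weights in GiB at a given bit width."""
    return num_params * bits_per_weight / 8 / 2**30


# ASSUMPTION: approximate total parameter count, read from the "7.8B" model name.
TOTAL_PARAMS = 7.8e9

# Nominal bit widths; real quantization schemes add some per-block overhead.
for label, bits in [("bfloat16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: about {weight_memory_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

These figures are lower bounds for planning purposes, not measurements of any specific file in the repository.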

License

The model is distributed under the EXAONE AI Model License Agreement 1.1 - NC. For full license details, refer to the LICENSE file in the repository.
