QuantFactory/EXAONE-3.5-7.8B-Instruct-GGUF
Introduction
EXAONE 3.5 is a series of instruction-tuned bilingual (English and Korean) generative models developed by LG AI Research, ranging from 2.4B to 32B parameters. The 7.8B model offers improved performance over its predecessor, EXAONE 3.0 7.8B. All models support long-context processing of up to 32K tokens and demonstrate state-of-the-art performance on real-world use cases.
Architecture
The EXAONE 3.5-7.8B model has 6.98 billion parameters (excluding embeddings), 32 layers, and grouped-query attention (GQA) with 32 query heads and 8 key-value heads. The vocabulary size is 102,400, and the maximum context length is 32,768 tokens.
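If you want to check these numbers yourself, the configuration can be inspected with transformers. A minimal sketch, assuming the EXAONE custom config exposes the usual transformers attribute names (it may name them differently; treat these as assumptions):

from transformers import AutoConfig

# Sketch: attribute names assumed to follow common transformers
# conventions; the EXAONE custom config class may use different ones.
config = AutoConfig.from_pretrained(
    "LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct",
    trust_remote_code=True,  # EXAONE ships a custom configuration class
)
print(config.num_attention_heads)      # expected: 32 query heads
print(config.num_key_value_heads)      # expected: 8 KV heads (GQA)
print(config.vocab_size)               # expected: 102,400
print(config.max_position_embeddings)  # expected: 32,768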
Training
The language models were trained to make use of a system prompt and are optimized for both general and domain-specific use cases. LG AI Research also releases pre-quantized versions in AWQ and several other quantization formats.
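Since this repository provides GGUF quantizations, the model can also be run without transformers. A minimal sketch using llama-cpp-python; the repo id and the quantization filename pattern below are assumptions, so check the repository's file listing for the exact GGUF file name:

from llama_cpp import Llama

# Sketch: the filename glob is a placeholder for a 4-bit quant;
# pick the actual GGUF file you want from the repository.
llm = Llama.from_pretrained(
    repo_id="QuantFactory/EXAONE-3.5-7.8B-Instruct-GGUF",  # assumed repo id
    filename="*Q4_K_M.gguf",
    n_ctx=4096,  # raise toward 32768 if memory allows
)
out = llm.create_chat_completion(
    messages=[
        {"role": "system",
         "content": "You are EXAONE model from LG AI Research, a helpful assistant."},
        {"role": "user", "content": "Explain how wonderful you are"},
    ],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])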
Guide: Running Locally
To run the EXAONE 3.5-7.8B model locally, you need transformers v4.43 or later. Here is a brief code snippet to get started:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct"

# Load the model in bfloat16; trust_remote_code is required because
# EXAONE ships custom model code with the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Explain how wonderful you are"  # English example

messages = [
    {"role": "system",
     "content": "You are EXAONE model from LG AI Research, a helpful assistant."},
    {"role": "user", "content": prompt},
]
# Render the chat template and append the assistant-turn prefix
# so the model knows to generate a reply.
input_ids = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
)

# Greedy decoding, capped at 128 new tokens.
output = model.generate(
    input_ids.to("cuda"),
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=128,
    do_sample=False,
)
print(tokenizer.decode(output[0]))
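To print tokens as they are generated instead of waiting for the full completion, you can pass a TextStreamer to generate. A small sketch reusing the objects from the snippet above:

from transformers import TextStreamer

# Stream decoded tokens to stdout as they are produced;
# skip_prompt=True avoids echoing the chat-formatted input.
streamer = TextStreamer(tokenizer, skip_prompt=True)
model.generate(
    input_ids.to("cuda"),
    eos_token_id=tokenizer.eos_token_id,
    max_new_tokens=128,
    do_sample=False,
    streamer=streamer,
)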
For efficient execution, consider renting cloud GPU instances from providers such as AWS, Google Cloud, or Azure.
License
The model is distributed under the EXAONE AI Model License Agreement 1.1 - NC. For full license details, refer to the LICENSE file in the repository.