Llama-VARCO-8B-Instruct
NCSOFT

Introduction
Llama-VARCO-8B-Instruct is a generative language model developed by NC Research's Language Model Team, with a focus on strong capabilities in both Korean and English. It is trained with supervised fine-tuning and direct preference optimization to align closely with human preferences, particularly in Korean.
Architecture
Llama-VARCO-8B-Instruct is based on the Meta-Llama-3.1-8B architecture and continually pre-trained on Korean and English datasets to ensure proficiency in both languages. The model is part of the Llama series and is used through the transformers library, optimized for performance in multilingual environments.
Training
The model undergoes continual pre-training followed by fine-tuning to improve its performance in Korean while maintaining English proficiency. Supervised fine-tuning (SFT) and direct preference optimization (DPO) are applied to better align it with human preferences. The training process leverages both Korean and English datasets to enhance its generative capabilities.
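As a rough illustration of the DPO step mentioned above (a sketch of the standard objective from Rafailov et al., 2023, not NCSOFT's actual training code), the loss compares how strongly the policy prefers a chosen response over a rejected one, relative to a frozen reference model:

```python
# Hedged sketch of the DPO objective; inputs are per-sequence log-probs
# of the chosen and rejected responses under the policy and a frozen
# reference model. All tensors and the beta value here are illustrative.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss on a batch of preference pairs."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(logits)): minimized when the policy, relative to the
    # reference, assigns higher probability to chosen than rejected
    return -F.logsigmoid(logits).mean()

# Toy batch of one pair where the policy already leans toward "chosen"
loss = dpo_loss(torch.tensor([-10.0]), torch.tensor([-14.0]),
                torch.tensor([-11.0]), torch.tensor([-12.0]))
```

Minimizing this loss pushes the policy's preference margin (scaled by `beta`) toward the human-labeled ordering without an explicit reward model.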
Guide: Running Locally
To run the Llama-VARCO-8B-Instruct model locally, you can use the following steps:
- Install the Transformers library: ensure that transformers version 4.43.0 or later is installed.
- Load the Model and Tokenizer:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "NCSOFT/Llama-VARCO-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NCSOFT/Llama-VARCO-8B-Instruct")
```
- Prepare Input Messages: Create a list of messages to be processed by the model.
- Generate Output: Use the model's `generate` function to produce outputs.
- Cloud GPUs: For optimal performance, consider using cloud-based GPUs from providers such as AWS, Google Cloud, or Azure.
License
The Llama-VARCO-8B-Instruct model is distributed under the LLAMA 3.1 Community License Agreement. Users must adhere to this license when utilizing the model for their applications.