Llama DNA 1.0 8B Instruct
Introduction
DNA 1.0 8B Instruct is a state-of-the-art bilingual language model developed by Dnotitia Inc. It is based on the Llama architecture and optimized for both Korean and English language understanding and generation. The model is designed for various NLP tasks, including conversation and chat, with strong capabilities in instruction-following.
Architecture
The model was produced by merging with Llama 3.1 8B Instruct via spherical linear interpolation (SLERP) and by knowledge distillation with Llama 3.1 405B as the teacher model. Training also included continual pre-training on a high-quality Korean dataset, supervised fine-tuning, and direct preference optimization (DPO) to improve instruction following. The vocabulary size is 128,256 tokens and the context length is 131,072 tokens.
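To make the SLERP step concrete, here is a minimal sketch of spherical linear interpolation between two vectors. It only illustrates the formula. How Dnotitia applied it to the model weights (per tensor or per layer, and with which interpolation factor) is not stated, so the function name `slerp`, the flat-list inputs, and the parameter `t` are assumptions made for illustration.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0 and t=1 returns v1; values in between follow the arc
    between the two directions instead of the straight line.
    """
    dot = sum(a * b for a, b in zip(v0, v1))
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    # Cosine of the angle between the vectors, clamped for numerical safety.
    cos_omega = max(-1.0, min(1.0, dot / (norm0 * norm1 + eps)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # (Anti-)parallel vectors: the arc is undefined, fall back to plain
        # linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors: roughly [0.7071, 0.7071].
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

In a weight merge, the two vectors would be flattened parameter tensors from the two checkpoints. Unlike plain averaging, the halfway result in the example keeps unit length.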
Training
DNA 1.0 8B Instruct was fine-tuned on approximately 10 billion tokens of curated data. It underwent extensive instruction tuning to improve its ability to follow complex instructions and engage in natural conversations. The model was evaluated against other prominent models on various benchmarks and achieved top scores in several Korean-specific and general language understanding tasks.
Guide: Running Locally
To run DNA 1.0 8B Instruct locally, follow these steps:
- Install the transformers library version 4.43.0 or higher.
- Load the model and tokenizer using the AutoModelForCausalLM and AutoTokenizer classes from the transformers library.
- Create a conversation template and generate responses using the model.
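The version requirement in the first step can be checked before loading anything. The following is a small sketch using only the standard library. The helper names `version_tuple`, `meets_minimum`, and `check_transformers` are hypothetical, and the parser is deliberately simple: it compares only the first three numeric components, so a pre-release such as `4.43.0rc1` is treated like `4.43.0`.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = "4.43.0"  # minimum version stated in the steps above

def version_tuple(version):
    """Turn '4.43.0' or '4.44.0.dev0' into a comparable (major, minor, patch) tuple."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    # Pad short versions such as '4.43' with zeros.
    return tuple(numbers + [0] * (3 - len(numbers)))

def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)

def check_transformers():
    """Return True if transformers is installed and new enough, else False."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed)

print("transformers OK:", check_transformers())
```

If the check prints `False`, installing or upgrading the transformers package to 4.43.0 or later should satisfy the first step.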
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the tokenizer and the model. device_map='auto' places the weights on the
# available devices and requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct')
model = AutoModelForCausalLM.from_pretrained('dnotitia/Llama-DNA-1.0-8B-Instruct', device_map='auto')
# Print decoded tokens as they are generated, without echoing the prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

conversation = [
    {"role": "system", "content": "You are a helpful assistant, Dnotitia DNA."},
    {"role": "user", "content": "너의 이름은?"},  # Korean for "What is your name?"
]
# Render the conversation with the model's chat template and tokenize it.
inputs = tokenizer.apply_chat_template(conversation,
                                       add_generation_prompt=True,
                                       return_dict=True,
                                       return_tensors="pt").to(model.device)
_ = model.generate(**inputs, streamer=streamer)
For optimal performance, it is recommended to use cloud GPUs such as those offered by AWS, Google Cloud, or Azure.
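As a rough guide to how much GPU memory is involved, the sketch below estimates the size of the weights and of the key/value cache at the full 131,072-token context. The figures are assumptions rather than published numbers for this model: about 8 billion parameters (taken from the "8B" in the name), 2 bytes per value (bfloat16 or float16), and the Llama 3.1 8B layout of 32 layers, 8 key/value heads, and a head dimension of 128, which this model is assumed to share. The function names are hypothetical.

```python
GIB = 1024 ** 3

def weight_bytes(n_params, bytes_per_param=2):
    """Memory needed to hold the parameters at the given precision."""
    return n_params * bytes_per_param

def kv_cache_bytes(n_tokens, n_layers=32, n_kv_heads=8, head_dim=128,
                   bytes_per_value=2):
    """Memory for the key/value cache of one sequence.

    Defaults are the assumed Llama 3.1 8B configuration; the leading factor
    of 2 accounts for storing both a key and a value tensor in every layer.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_tokens

print(f"weights:  {weight_bytes(8_000_000_000) / GIB:.1f} GiB")
print(f"KV cache: {kv_cache_bytes(131_072) / GIB:.1f} GiB at 131,072 tokens")
```

Under these assumptions the cache for a single full-length sequence comes to 16 GiB on top of the weights, which is one reason a data-center GPU may be preferable for long-context use. Short prompts need far less.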
License
This model is released under the CC BY-NC 4.0 license. For commercial use, please contact Dnotitia Inc. through their contact form.