saiga_llama3_8b
IlyaGusev
Introduction
SAIGA/LLAMA3 8B is a Russian-language chatbot based on the Llama-3 8B model. It is designed for conversational use and text generation, and it employs a specific prompt format; the model card includes sample interactions that demonstrate its capabilities.
Architecture
The model leverages the Llama-3 8B architecture, known for its capabilities in text generation and conversational AI. It is compatible with the transformers and safetensors libraries. Starting with version 4, the prompt format switched from ChatML to the native Llama-3 chat format.
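The Llama-3 chat format wraps each turn in special header tokens. A minimal sketch of that layout follows (illustrative only; in practice the tokenizer's `apply_chat_template` produces this string for you, and the function name here is an assumption, not part of any library):

```python
def format_llama3_prompt(messages):
    """Render a list of {"role", "content"} dicts in the Llama-3 chat layout.

    Sketch for illustration; the authoritative template ships with the tokenizer.
    """
    parts = ["<|begin_of_text|>"]
    for m in messages:
        # Each turn: role header, blank line, content, end-of-turn token
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # A trailing assistant header cues the model to generate its reply
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = format_llama3_prompt([
    {"role": "system", "content": "Ты — Сайга, русскоязычный ассистент."},
    {"role": "user", "content": "Как дела?"},
])
```

Using the tokenizer's built-in template keeps the prompt in sync with the model's training format, so this helper is best treated as documentation of the layout rather than production code.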
Training
The SAIGA/LLAMA3 8B model has undergone several training iterations, each identified by a version number. It is trained on datasets such as saiga_scored, with configurations for Supervised Fine-Tuning (SFT) and Kahneman-Tversky Optimization (KTO). Training runs and configurations are documented through linked Weights & Biases (wandb) sessions, and the model has been evaluated using a framework based on AlpacaEval.
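AlpacaEval-style evaluation boils down to a win rate over pairwise judgments from a judge model. A minimal sketch of that bookkeeping (the function name and tie-handling here are illustrative assumptions, not the framework's actual API):

```python
def win_rate(judgments):
    """Fraction of pairwise comparisons won, counting ties as half a win.

    judgments: iterable of "win", "tie", or "loss" verdicts, one per
    prompt, as produced by a judge model comparing two chatbots' answers.
    """
    scores = {"win": 1.0, "tie": 0.5, "loss": 0.0}
    judgments = list(judgments)
    return sum(scores[j] for j in judgments) / len(judgments)

# Two wins, one tie, one loss out of four comparisons -> 0.625
rate = win_rate(["win", "loss", "tie", "win"])
```

Counting ties as half a win keeps the metric symmetric: two identical models compared against each other score 0.5.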
Guide: Running Locally
- Setup: Use the transformers library to load the SAIGA/LLAMA3 8B model and tokenizer.
- Configuration: Use GenerationConfig for model generation settings.
- Execution: Run the provided Python code to generate text outputs from user inputs.
- Hardware Recommendations: For optimal performance, consider cloud GPUs such as those provided by AWS, Google Cloud, or Azure.
Example code snippet for local execution (the snippet loads the model in 8-bit, which requires the bitsandbytes package, then builds a prompt with the tokenizer's chat template and generates a reply):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

MODEL_NAME = "IlyaGusev/saiga_llama3_8b"
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, load_in_8bit=True, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
generation_config = GenerationConfig.from_pretrained(MODEL_NAME)

prompt = tokenizer.apply_chat_template([{"role": "user", "content": "Почему трава зеленая?"}], tokenize=False, add_generation_prompt=True)
data = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**data, generation_config=generation_config)[0][len(data["input_ids"][0]):]
print(tokenizer.decode(output_ids, skip_special_tokens=True))
License
The SAIGA/LLAMA3 8B model is distributed under the Llama 3 license. For detailed terms and conditions, refer to the linked license.