saiga_llama3_8b

IlyaGusev

Introduction

SAIGA/LLAMA3 8B is a Russian-language chatbot based on the Llama-3 8B model. It is designed for conversational use and text generation, and its model card documents the expected prompt format along with sample interactions that demonstrate its capabilities.

Architecture

The model is built on the Llama-3 8B architecture, known for its strong text-generation and conversational abilities. It is compatible with the transformers and safetensors libraries and expects a specific prompt format: as of version 4, it uses the original Llama-3 chat format rather than ChatML, which earlier versions used.
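To make the format switch concrete, the helper below is a minimal sketch of the Llama-3 chat layout (the special tokens are those of the stock Llama-3 template; in practice, `tokenizer.apply_chat_template` from transformers produces the same result, and the Russian system prompt shown is illustrative, not canonical):

```python
# Sketch of the Llama-3 chat format used by saiga_llama3_8b since v4.
# Each turn is wrapped in header tokens and terminated with <|eot_id|>;
# a trailing assistant header cues the model to generate its reply.

def format_llama3_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Llama-3 prompt string."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "Ты — Сайга, русскоязычный ассистент."},
    {"role": "user", "content": "Привет! Как дела?"},
]
print(format_llama3_prompt(messages))
```

Version 3 and earlier used ChatML (`<|im_start|>role ... <|im_end|>`) instead of the header tokens above.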

Training

The SAIGA/LLAMA3 8B model has gone through several training iterations, each identified by a version number. It is trained on datasets such as saiga_scored using configurations for Supervised Fine-Tuning (SFT) and Kahneman-Tversky Optimization (KTO). Training runs and configurations are documented through linked Weights & Biases (wandb) sessions, and the model has been evaluated with a framework based on AlpacaEval.

Guide: Running Locally

  1. Setup: Load the SAIGA/LLAMA3 8B model and tokenizer with the transformers library.
  2. Configuration: Load generation settings with GenerationConfig (the repository ships a default configuration).
  3. Execution: Use the provided Python code to generate responses to user inputs.
  4. Hardware Recommendations: For optimal performance, consider cloud GPUs such as those offered by AWS, Google Cloud, or Azure.

Example code snippet for local execution:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

MODEL_NAME = "IlyaGusev/saiga_llama3_8b"

# 8-bit loading requires the bitsandbytes package; drop load_in_8bit=True
# to load the full bfloat16 weights instead.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    load_in_8bit=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
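Continuing from the loading snippet above, a minimal generation pass might look like the following sketch. The user message and system prompt are illustrative; `GenerationConfig.from_pretrained` pulls the sampling defaults stored in the model repository, and `apply_chat_template` renders the Llama-3 chat format the model expects.

```python
# Sketch of a single-turn generation pass (assumes `model` and `tokenizer`
# from the loading snippet are already in scope).
generation_config = GenerationConfig.from_pretrained(MODEL_NAME)

prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": "Ты — Сайга, русскоязычный ассистент."},
        {"role": "user", "content": "Как приготовить борщ?"},
    ],
    tokenize=False,
    add_generation_prompt=True,  # append the assistant header so the model replies
)

data = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
data = {k: v.to(model.device) for k, v in data.items()}

output_ids = model.generate(**data, generation_config=generation_config)[0]
# Strip the prompt tokens, keeping only the newly generated reply.
output_ids = output_ids[len(data["input_ids"][0]):]
print(tokenizer.decode(output_ids, skip_special_tokens=True).strip())
```

On an 8-bit load the 8B model fits comfortably in roughly 10 GB of GPU memory, which is why the guide recommends cloud GPUs for responsive generation.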

License

The SAIGA/LLAMA3 8B model is distributed under the Llama 3 license. For detailed terms and conditions, refer to the license linked from the model repository.
