Vikhr Llama 3.2 1 B Instruct LLM Model

Introduction

The Vikhr-Llama-3.2-1B-Instruct is an instructive language model based on the Llama-3.2-1B architecture. It is specifically designed for processing the Russian language and trained on the GrandMaster-PRO-MAX dataset. The model is highly efficient, being five times more efficient than its base model, making it suitable for deployment on low-power and mobile devices.

Architecture

This model is a variant of the Llama-3.2-1B-Instruct, optimized for Russian language processing. It maintains a compact size under 3GB, significantly enhancing its suitability for use in resource-constrained environments.

Training

The Vikhr-Llama-3.2-1B-Instruct was developed using the Supervised Fine-Tuning (SFT) method. It was trained on a synthetic dataset containing 150k instructions, supporting the Chain-Of-Thought (CoT) approach with prompts for GPT-4-turbo. The training scripts are available in the GitHub repository effective_llm_alignment.

Guide: Running Locally

Install Required Libraries: Ensure transformers and torch libraries are installed in your environment.

Load the Model:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Vikhrmodels/Vikhr-Llama-3.2-1B-instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

Prepare Input Text:

input_text = "Напиши очень краткую рецензию о книге гарри поттер."
input_ids = tokenizer.encode(input_text, return_tensors="pt")

Generate Text:

output = model.generate(
  input_ids,
  max_length=1512,
  temperature=0.3,
  num_return_sequences=1,
  no_repeat_ngram_size=2,
  top_k=50,
  top_p=0.95,
)

Decode and Print Output:

generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

Cloud GPUs: For enhanced performance, consider using cloud GPU services like Google Colab or AWS EC2.

License

The Vikhr-Llama-3.2-1B-Instruct model is released under the llama3.2 license.

More Related APIs in Text Generation