Vikhr-Llama-3.2-1B-Instruct (Vikhrmodels)

Introduction
Vikhr-Llama-3.2-1B-Instruct is an instruction-tuned language model based on the Llama-3.2-1B architecture. It is designed specifically for processing Russian and was trained on the GrandMaster-PRO-MAX dataset. The authors report it to be roughly five times more efficient than its base model, making it suitable for deployment on low-power and mobile devices.
Architecture
This model is a variant of Llama-3.2-1B-Instruct optimized for Russian-language processing. At under 3 GB, its compact size makes it well suited to resource-constrained environments.
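As a rough sanity check on that size figure, the weight footprint can be estimated from the parameter count and the storage precision. The ~1.24B parameter count used below is an assumption based on the Llama-3.2-1B family, not a figure from this card:

```python
# Back-of-the-envelope weight-size estimate; the parameter count is an
# assumption (~1.24e9, typical for Llama-3.2-1B), not an official figure.
PARAMS = 1.24e9

for dtype, bytes_per_param in [("fp32", 4), ("bf16/fp16", 2), ("int8", 1)]:
    size_gb = PARAMS * bytes_per_param / 1024**3
    print(f"{dtype}: ~{size_gb:.1f} GB")

# bf16/fp16 comes out around 2.3 GB, consistent with the "under 3 GB" claim.
```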
Training
Vikhr-Llama-3.2-1B-Instruct was developed using the Supervised Fine-Tuning (SFT) method. It was trained on a synthetic dataset of about 150k instructions, with support for the Chain-of-Thought (CoT) approach using prompts for GPT-4-turbo. The training scripts are available in the effective_llm_alignment GitHub repository.
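For orientation, the SFT stage in its most generic form looks like the sketch below, using the TRL library. The dataset identifier, base checkpoint, and hyperparameters are illustrative assumptions; the authors' actual configuration lives in the effective_llm_alignment repository:

```python
# Generic SFT sketch with TRL -- illustrative only; see the authors'
# effective_llm_alignment repository for the real training scripts.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed dataset location and format: chat-style records under "messages".
dataset = load_dataset("Vikhrmodels/GrandMaster-PRO-MAX", split="train")

trainer = SFTTrainer(
    model="meta-llama/Llama-3.2-1B-Instruct",  # assumed base checkpoint
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="vikhr-sft",
        per_device_train_batch_size=2,  # illustrative hyperparameters
        num_train_epochs=1,
        max_seq_length=2048,
    ),
)
trainer.train()
```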
Guide: Running Locally
- Install Required Libraries: Ensure the `transformers` and `torch` libraries are installed in your environment (e.g. `pip install transformers torch`).
- Load the Model:
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_name = "Vikhrmodels/Vikhr-Llama-3.2-1B-instruct"
  model = AutoModelForCausalLM.from_pretrained(model_name)
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  ```
- Prepare Input Text (for chat-style prompting with the model's chat template, see the sketch after this list):
  ```python
  # The prompt asks (in Russian): "Write a very short review of the Harry Potter book."
  input_text = "Напиши очень краткую рецензию о книге гарри поттер."
  input_ids = tokenizer.encode(input_text, return_tensors="pt")
  ```
- Generate Text:
  ```python
  output = model.generate(
      input_ids,
      max_length=1512,  # total length, prompt tokens included
      do_sample=True,   # required for temperature/top_k/top_p to take effect
      temperature=0.3,
      num_return_sequences=1,
      no_repeat_ngram_size=2,
      top_k=50,
      top_p=0.95,
  )
  ```
- Decode and Print Output:
  ```python
  generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
  print(generated_text)
  ```
- Cloud GPUs: For enhanced performance, consider using cloud GPU services such as Google Colab or AWS EC2.
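Since this is an instruction-tuned model, it may respond better when the prompt is wrapped with the tokenizer's chat template rather than encoded as raw text. A minimal sketch, assuming the tokenizer ships a chat template (as most Llama-3.2-based instruct models do), reusing `model` and `tokenizer` from the steps above:

```python
# Chat-template variant of the "Prepare Input Text" and "Generate Text" steps;
# assumes the tokenizer config defines a chat template.
messages = [
    {"role": "user", "content": "Напиши очень краткую рецензию о книге гарри поттер."}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn header
    return_tensors="pt",
)
output = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.3)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```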
License
The Vikhr-Llama-3.2-1B-Instruct model is released under the llama3.2 license.