Vikhr Llama 3.2 1 B Instruct

Vikhrmodels

Introduction

The Vikhr-Llama-3.2-1B-Instruct is an instructive language model based on the Llama-3.2-1B architecture. It is specifically designed for processing the Russian language and trained on the GrandMaster-PRO-MAX dataset. The model is highly efficient, being five times more efficient than its base model, making it suitable for deployment on low-power and mobile devices.

Architecture

This model is a variant of the Llama-3.2-1B-Instruct, optimized for Russian language processing. It maintains a compact size under 3GB, significantly enhancing its suitability for use in resource-constrained environments.

Training

The Vikhr-Llama-3.2-1B-Instruct was developed using the Supervised Fine-Tuning (SFT) method. It was trained on a synthetic dataset containing 150k instructions, supporting the Chain-Of-Thought (CoT) approach with prompts for GPT-4-turbo. The training scripts are available in the GitHub repository effective_llm_alignment.

Guide: Running Locally

  1. Install Required Libraries: Ensure transformers and torch libraries are installed in your environment.

  2. Load the Model:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = "Vikhrmodels/Vikhr-Llama-3.2-1B-instruct"
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
  3. Prepare Input Text:

    input_text = "Напиши очень краткую рецензию о книге гарри поттер."
    input_ids = tokenizer.encode(input_text, return_tensors="pt")
    
  4. Generate Text:

    output = model.generate(
      input_ids,
      max_length=1512,
      temperature=0.3,
      num_return_sequences=1,
      no_repeat_ngram_size=2,
      top_k=50,
      top_p=0.95,
    )
    
  5. Decode and Print Output:

    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    print(generated_text)
    
  6. Cloud GPUs: For enhanced performance, consider using cloud GPU services like Google Colab or AWS EC2.

License

The Vikhr-Llama-3.2-1B-Instruct model is released under the llama3.2 license.

More Related APIs in Text Generation