Vikhr-Qwen-2.5-1.5B-Instruct
Introduction
Vikhr-Qwen-2.5-1.5B-Instruct is a bilingual instruction-tuned model designed for high-efficiency text processing in Russian and English. Trained on the GrandMaster-PRO-MAX dataset, it delivers precise responses and fast task execution across a range of uses, from professional environments to user-facing applications.
Architecture
The model is based on the Qwen-2.5-1.5B-Instruct architecture, with a focus on bilingual support in Russian and English. It employs Supervised Fine-Tuning (SFT) and Chain-of-Thought (CoT) methodologies to strengthen its instruction following, contextual responses, and text analysis.
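Because the model reuses the Qwen-2.5-1.5B-Instruct backbone, its dimensions can be read directly from the published configuration rather than taken on faith. A minimal sketch (assuming network access to the Hugging Face Hub; the printed fields are standard Qwen2 config attributes):

```python
from transformers import AutoConfig

# Fetch the model configuration from the Hugging Face Hub
config = AutoConfig.from_pretrained("Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct")

# Standard Qwen2 config fields; the values come from the hosted config.json
print(config.model_type)         # architecture family (e.g. "qwen2")
print(config.num_hidden_layers)  # transformer depth
print(config.hidden_size)        # embedding width
print(config.vocab_size)         # shared Russian/English vocabulary size
```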
Training
The Vikhr-Qwen-2.5-1.5B-Instruct model was developed using the SFT method on a synthetic dataset, GrandMaster-PRO-MAX, consisting of 150,000 instructions. The training process incorporated CoT methodology and GPT-4-turbo prompts to achieve high accuracy and coherence in responses.
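The training corpus can be inspected directly. A minimal sketch, assuming the dataset is published on the Hub under the id Vikhrmodels/GrandMaster-PRO-MAX with a "train" split (verify the exact id and split name before running):

```python
from datasets import load_dataset

# Hub id and split are assumptions based on the dataset name above
dataset = load_dataset("Vikhrmodels/GrandMaster-PRO-MAX", split="train")

print(len(dataset))  # ~150,000 instruction examples per the description above
print(dataset[0])    # one synthetic instruction/response pair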
Guide: Running Locally
To run the model locally, follow these steps:
- Install the Transformers library: ensure you have the `transformers` library installed.

  ```bash
  pip install transformers
  ```
- Load the model and tokenizer:

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_name = "Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct"
  model = AutoModelForCausalLM.from_pretrained(model_name)
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  ```
- Prepare input and generate output:

  ```python
  # "Write a short description of the Harry Potter book."
  input_text = "Напиши краткое описание книги Гарри Поттер."

  messages = [
      # System prompt: "You are Vikhr, an AI assistant created by Vikhr models
      # to provide helpful, honest and safe information."
      {"role": "system", "content": "Вы — Vikhr, ИИ помощник, созданный компанией Vikhr models для предоставления полезной, честной и безопасной информации."},
      {"role": "user", "content": input_text},
  ]

  input_ids = tokenizer.apply_chat_template(
      messages,
      truncation=True,
      add_generation_prompt=True,
      return_tensors="pt",
  )

  output = model.generate(
      input_ids,
      max_length=1512,
      do_sample=True,  # required for temperature/top_k/top_p to take effect
      temperature=0.3,
      num_return_sequences=1,
      no_repeat_ngram_size=2,
      top_k=50,
      top_p=0.95,
  )

  generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
  print(generated_text)
  ```
- Consider cloud GPUs: for optimal performance, use cloud GPU services such as AWS, Google Cloud, or Azure (a GPU-loading sketch follows this list).
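For GPU execution, a common transformers pattern is to load the weights in half precision and let the library place them automatically. A minimal sketch, assuming PyTorch with CUDA and the `accelerate` package are installed (the prompt is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Vikhrmodels/Vikhr-Qwen-2.5-1.5B-Instruct"

# device_map="auto" requires the `accelerate` package
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # halves memory vs. float32; use float16 on older GPUs
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```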
License
The Vikhr-Qwen-2.5-1.5B-Instruct model is distributed under the Apache-2.0 License. This allows for open-source use with minimal restrictions, promoting collaboration and modification.