Vikhr-Nemo-12B-Instruct-R-21-09-24
Introduction
Vikhr-Nemo-12B-Instruct-R-21-09-24 is a flagship unimodal Large Language Model (LLM) developed by VikhrModels. It is an enhanced version of mistralai/Mistral-Nemo-Instruct-2407, optimized for Russian and English. The model supports reasoning, summarization, coding, roleplay, and multi-turn dialogue, generates text in multiple languages, and offers high-performance Retrieval-Augmented Generation (RAG) capabilities.
Architecture
The model supports a context window of up to 128k tokens and features a Grounded RAG mode for document-based question answering. It is built on the Hugging Face transformers library and generates text in multiple languages, including English and Russian.
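For local experimentation, the model can be loaded directly with transformers. The following is a minimal sketch, assuming a GPU with enough memory for the 12B weights in bfloat16; the prompt and generation parameters are illustrative.

```python
# Minimal sketch: load Vikhr-Nemo-12B-Instruct-R-21-09-24 with transformers.
# Assumes a GPU with enough memory for 12B parameters in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat prompt with the model's own chat template.
messages = [{"role": "user", "content": "Кратко объясни, что такое RAG."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, temperature=0.3, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```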
Training
The training process involved several stages, including Supervised Fine-Tuning (SFT) and a custom alignment phase with SMPO. A synthetic dataset of 150k instructions, Vikhrmodels/GrandMaster-PRO-MAX, was used alongside a separate dataset for RAG grounding, Vikhrmodels/Grounded-RAG-RU-v2. The model underwent further quality improvement through a custom Reward Model and Rejection Sampling.
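The alignment code itself is not published here, but the idea behind reward-model rejection sampling is simple: sample several completions per prompt and keep only the highest-scoring one for further fine-tuning. The sketch below is purely illustrative; `generate_candidates` and `reward_model` are hypothetical placeholders, not the actual Vikhr pipeline.

```python
# Illustrative sketch of reward-model rejection sampling, not the actual
# Vikhr training code. `generate_candidates` and `reward_model` are
# hypothetical stand-ins for a policy model and a trained reward model.
def rejection_sample(prompt, generate_candidates, reward_model, n=8):
    # Sample n candidate completions for the same prompt.
    candidates = generate_candidates(prompt, num_samples=n)
    # Score every candidate and keep the best one; the resulting
    # (prompt, best completion) pairs form a higher-quality SFT set.
    return max(candidates, key=lambda c: reward_model.score(prompt, c))
```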
Guide: Running Locally
- Setup Environment: Install the necessary dependencies, e.g. Hugging Face's transformers library, or vllm for serving.
- Load Model: Use the vllm server to run the model (a client sketch for querying it follows this list):

  ```
  vllm serve --dtype half --max-model-len 32000 -tp 1 Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24 --api-key token-abc123
  ```

- Use Appropriate Prompts: Configure system prompts for specific tasks such as RAG (see the grounded RAG sketch below).
- Deployment: Consider cloud GPUs, such as AWS EC2 or Google Cloud Platform, for efficient deployment.
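Once the server is running, it exposes an OpenAI-compatible API (port 8000 is vLLM's default). The sketch below queries it with the openai Python client; the base URL, system prompt, and sampling parameters are illustrative assumptions.

```python
# Sketch: query the local vLLM server via its OpenAI-compatible API.
# Assumes the `vllm serve` command above is running on localhost:8000.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

response = client.chat.completions.create(
    model="Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24",
    messages=[
        {"role": "system", "content": "Answer concisely."},
        {"role": "user", "content": "Explain what RAG is in two sentences."},
    ],
    temperature=0.3,
    max_tokens=256,
)
print(response.choices[0].message.content)
```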
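For the Grounded RAG mode, documents are supplied alongside a task-specific system prompt. The sketch below passes them as a JSON list in a dedicated documents-role message, following the format described in the upstream model card; the field names and system prompt wording are assumptions that should be verified there.

```python
# Sketch of a grounded RAG request. The "documents" role and the field
# names (doc_id, title, content) follow the upstream model card's RAG
# format; verify them there before relying on this in production.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

docs = [
    {"doc_id": 0, "title": "vLLM", "content": "vLLM is a fast LLM inference engine."},
    {"doc_id": 1, "title": "Unrelated", "content": "Notes on something else."},
]

messages = [
    {"role": "system", "content": "Answer using only the provided documents."},
    {"role": "documents", "content": json.dumps(docs, ensure_ascii=False)},
    {"role": "user", "content": "What is vLLM?"},
]

response = client.chat.completions.create(
    model="Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24",
    messages=messages,
    temperature=0.0,
)
print(response.choices[0].message.content)
```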
License
This model is licensed under the Apache 2.0 license.