Vikhr-Nemo-12B-Instruct-R-21-09-24

Introduction

Vikhr-Nemo-12B-Instruct-R-21-09-24 is a flagship unimodal Large Language Model (LLM) developed by Vikhrmodels. It is an enhanced version of the mistralai/Mistral-Nemo-Instruct-2407 model, optimized for Russian and English. The model handles reasoning, summarization, coding, roleplay, and multi-turn dialogue, and generates text in multiple languages with strong Retrieval-Augmented Generation (RAG) capabilities.

Architecture

The model supports a context window of up to 128k tokens and features a Grounded RAG mode for document-based question answering. It is distributed in a format compatible with the transformers library and generates text in multiple languages, including English and Russian; a minimal loading example follows.
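
Since the model loads through the standard transformers API, a minimal sketch might look like the following. The bfloat16 precision, single-GPU placement, and generation settings are illustrative assumptions, not values taken from the model card:

    # Minimal loading sketch (assumptions: bfloat16 weights on one GPU,
    # illustrative generation settings).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # The chat template accepts both English and Russian prompts.
    messages = [{"role": "user", "content": "Кратко объясни, что такое RAG."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))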

Training

The training process involved several stages: Supervised Fine-Tuning (SFT) followed by a custom alignment phase using SMPO. A synthetic dataset of 150k instructions, Vikhrmodels/GrandMaster-PRO-MAX, was used alongside a separate dataset for RAG grounding, Vikhrmodels/Grounded-RAG-RU-v2. Response quality was further improved through a custom Reward Model and Rejection Sampling, sketched below.
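
The Reward Model / Rejection Sampling step amounts to best-of-N selection: sample several candidate responses, score each with the reward model, and keep the highest-scoring one. The sketch below is purely illustrative; generate_candidates and reward_model_score are hypothetical stand-ins for the team's unpublished pipeline:

    import random

    def generate_candidates(prompt, n):
        # Hypothetical stand-in for sampling n responses from the SFT model.
        return [f"candidate {i} for: {prompt}" for i in range(n)]

    def reward_model_score(prompt, response):
        # Hypothetical stand-in for the custom Reward Model's scalar score.
        return random.random()

    def rejection_sample(prompt, n=8):
        # Best-of-N rejection sampling: keep the highest-scoring response.
        candidates = generate_candidates(prompt, n)
        return max(candidates, key=lambda c: reward_model_score(prompt, c))

    print(rejection_sample("Explain RAG briefly."))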

Guide: Running Locally

  1. Setup Environment: Install the necessary dependencies, including Hugging Face's transformers library.
  2. Load Model: Serve the model with vLLM using the command:
    vllm serve --dtype half --max-model-len 32000 -tp 1 Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24 --api-key token-abc123

  3. Use Appropriate Prompts: Configure system prompts for specific tasks such as RAG; a client example follows this list.
  4. Deployment: Consider cloud GPUs, such as AWS EC2 or Google Cloud Platform, for efficient deployment.
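
Once running, the vLLM server from step 2 exposes an OpenAI-compatible endpoint. The client sketch below is a minimal illustration: the dedicated "documents" role and the {doc_id, title, content} JSON schema are assumptions about the Grounded RAG prompt format, so consult the model card for the exact schema:

    # Minimal client sketch against the vLLM server started in step 2.
    # Assumption: Grounded RAG mode accepts a "documents" role whose content
    # is a JSON list of {doc_id, title, content} records.
    import json
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="token-abc123")

    docs = [{"doc_id": 0, "title": "Vikhr",
             "content": "Vikhr is a family of Russian-English instruction-tuned LLMs."}]

    response = client.chat.completions.create(
        model="Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24",
        messages=[
            {"role": "documents", "content": json.dumps(docs, ensure_ascii=False)},
            {"role": "user", "content": "What is Vikhr?"},
        ],
        temperature=0.0,  # low temperature keeps answers grounded in the documents
        max_tokens=256,
    )
    print(response.choices[0].message.content)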

License

This model is licensed under the Apache 2.0 license.
