Mistral Nemo Instruct 2407
Introduction
Mistral-Nemo-Instruct-2407 is a large language model (LLM) developed by Mistral AI in collaboration with NVIDIA. It is an instruction fine-tuned version of Mistral-Nemo-Base-2407, designed to outperform models of similar size, and is optimized for text generation across multiple languages.
Architecture
Mistral Nemo is a transformer-based model with the following specifications (a configuration sketch follows the list):
- Layers: 40
- Dimension: 5,120
- Head dimension: 128
- Hidden dimension: 14,336
- Activation function: SwiGLU
- Number of heads: 32
- Number of kv-heads: 8 (GQA)
- Vocabulary size: Approximately 128k
- Rotary embeddings: Theta = 1M
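For orientation, these hyperparameters map roughly onto a Hugging Face MistralConfig, as sketched below. The field names, the exact vocabulary size, and the context-length value are assumptions rather than values copied from the official configuration; head_dim in particular requires a recent transformers release.

from transformers import MistralConfig

config = MistralConfig(
    num_hidden_layers=40,             # layers
    hidden_size=5120,                 # model dimension
    head_dim=128,                     # per-head dimension (not hidden_size / num_heads here)
    intermediate_size=14336,          # hidden (feed-forward) dimension
    hidden_act="silu",                # SwiGLU uses a SiLU-gated feed-forward block
    num_attention_heads=32,
    num_key_value_heads=8,            # grouped-query attention (GQA)
    vocab_size=131072,                # "approximately 128k"; exact value assumed
    rope_theta=1_000_000.0,           # rotary embeddings, theta = 1M
    max_position_embeddings=128_000,  # 128k context window; exact value assumed
)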
Training
The model was trained with a 128k context window on a mix of multilingual and code data, and is intended as a drop-in replacement for Mistral 7B. Reported benchmarks indicate strong performance across a range of tasks, including HellaSwag, Winogrande, OpenBookQA, CommonSenseQA, and others.
Guide: Running Locally
Basic Steps
- Install Mistral Inference:
  pip install mistral_inference
- Download Model:
  from huggingface_hub import snapshot_download
  from pathlib import Path
  mistral_models_path = Path.home().joinpath('mistral_models', 'Nemo-Instruct')
  mistral_models_path.mkdir(parents=True, exist_ok=True)
  snapshot_download(
      repo_id="mistralai/Mistral-Nemo-Instruct-2407",
      allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"],
      local_dir=mistral_models_path,
  )
- Run Chat Interface (a programmatic alternative is sketched after these steps):
  mistral-chat $HOME/mistral_models/Nemo-Instruct --instruct --max_tokens 256 --temperature 0.35
- Use Transformers (Optional): To use the model with Hugging Face Transformers, install the library from source and follow the setup guidelines; a minimal pipeline example is sketched after these steps.
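As an alternative to the mistral-chat CLI, generation can also be scripted against the files downloaded above. This is a minimal sketch assuming the mistral_inference and mistral_common Python APIs as published around the model's release; module paths and signatures may differ between package versions.

from pathlib import Path
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# Same directory as used in the download step above.
mistral_models_path = Path.home().joinpath('mistral_models', 'Nemo-Instruct')

tokenizer = MistralTokenizer.from_file(str(mistral_models_path / "tekken.json"))
model = Transformer.from_folder(str(mistral_models_path))

request = ChatCompletionRequest(messages=[UserMessage(content="Summarize Mistral Nemo in one sentence.")])
tokens = tokenizer.encode_chat_completion(request).tokens

out_tokens, _ = generate(
    [tokens],
    model,
    max_tokens=256,
    temperature=0.35,  # matches the CLI example above
    eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id,
)
print(tokenizer.decode(out_tokens[0]))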
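For the optional Transformers route, a minimal sketch using the text-generation pipeline might look like the following; it assumes a transformers version recent enough to support this model and chat-style message inputs.

from transformers import pipeline

chatbot = pipeline(
    "text-generation",
    model="mistralai/Mistral-Nemo-Instruct-2407",
    max_new_tokens=128,
)
messages = [{"role": "user", "content": "Who are you?"}]
print(chatbot(messages))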
Cloud GPUs
For optimal performance, consider using cloud-based GPUs such as those offered by AWS, Google Cloud, or Azure.
License
The Mistral-Nemo-Instruct-2407 model is released under the Apache 2.0 License, allowing for free use and distribution with proper attribution.