Ministral-8B-Instruct-2410
Introduction
Ministral-8B-Instruct-2410 is an instruction-tuned language model developed by Mistral AI for on-device computing and edge use cases. It outperforms other models of comparable size and is released under the Mistral Research License. The model is trained on a large proportion of multilingual and code data, with strong multilingual capabilities and support for function calling.
Architecture
The Ministral-8B features a dense transformer architecture with the following specifications (a short sketch for cross-checking them programmatically follows the list):
- Parameters: 8,019,808,256
- Layers: 36
- Heads: 32
- Hidden Dimension: 12,288
- Vocabulary Size: 131,072
- Context Length: 128k
- Attention Pattern: Ragged (128k,32k,32k,32k)
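If the weights are pulled from the Hugging Face Hub, these values can be cross-checked against the published configuration. A minimal sketch with transformers, assuming the gated repository id mistralai/Ministral-8B-Instruct-2410:

```python
from transformers import AutoConfig

# Assumed repository id; the repo is gated, so accepting the license and
# authenticating with a Hugging Face token may be required.
config = AutoConfig.from_pretrained("mistralai/Ministral-8B-Instruct-2410")

# Field names follow the Hugging Face Mistral config convention; for example,
# intermediate_size corresponds to the (MLP) hidden dimension listed above.
print(config.num_hidden_layers)        # layers
print(config.num_attention_heads)      # attention heads
print(config.intermediate_size)        # hidden (MLP) dimension
print(config.vocab_size)               # vocabulary size
print(config.max_position_embeddings)  # context length
```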
Training
Ministral-8B is trained with a 128k context window using interleaved sliding-window attention. The model was fine-tuned with a variety of multilingual and code datasets to enhance its instruction-following capabilities.
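As a rough illustration of the sliding-window idea only (Ministral's actual interleaved, ragged schedule mixes 128k and 32k windows across layers and is not reproduced here), the mask below restricts each position to the most recent window tokens; the function name and sizes are illustrative:

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal mask where position i may attend to positions j with i - window < j <= i."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

# With a window of 4, token 10 attends to tokens 7 through 10.
mask = sliding_window_mask(seq_len=16, window=4)
print(mask[10].nonzero()[0])  # [ 7  8  9 10]
```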
Guide: Running Locally
Basic Steps
- Install vLLM and Mistral Common:
  pip install --upgrade vllm
  pip install --upgrade mistral_common
- Set Up Model:
  - Use the vLLM library to load the model (see the sketch after this list).
  - Ensure you have at least 24 GB of GPU RAM. For multi-device setups, pass tensor_parallel_size=2.
- Run Inference:
  - Use the vLLM server-client setup for chat completions, or run offline generation as in the sketch after this list.
  - Alternatively, leverage mistral-inference for a quick setup.
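A minimal offline sketch along these lines, assuming a recent vLLM build with Mistral-format support and the gated Hugging Face repository id mistralai/Ministral-8B-Instruct-2410; the Mistral tokenizer/config/load modes and the llm.chat API should be verified against the installed vLLM version, and the prompt and sampling values are arbitrary:

```python
from vllm import LLM
from vllm.sampling_params import SamplingParams

# Assumed gated Hugging Face repository id; accepting the Mistral Research
# License and authenticating with an HF token may be required to download it.
model_name = "mistralai/Ministral-8B-Instruct-2410"

# Single-GPU setup; roughly 24 GB of GPU RAM is needed for the unquantized
# weights. Uncomment tensor_parallel_size=2 to shard across two smaller GPUs.
llm = LLM(
    model=model_name,
    tokenizer_mode="mistral",   # use Mistral's native tokenizer format
    config_format="mistral",
    load_format="mistral",
    # tensor_parallel_size=2,
)

sampling_params = SamplingParams(max_tokens=256, temperature=0.7)

messages = [
    {"role": "user", "content": "Write a haiku about edge devices."},
]

outputs = llm.chat(messages, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)
```

For a client-server workflow, the same model can instead be exposed through vLLM's OpenAI-compatible server and queried over HTTP; mistral-inference remains an option for a quick local test.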
Cloud GPU Suggestions
Cloud services such as AWS or Google Cloud provide access to high-memory GPUs (24 GB or more) that can handle the model's requirements when suitable local hardware is unavailable.
License
The Ministral-8B-Instruct-2410 is distributed under the Mistral Research License. Usage is primarily for research purposes, and commercial use requires a separate license agreement with Mistral AI. The license outlines terms for distribution, modification, and usage limitations, emphasizing non-commercial and research-oriented applications. The full license text is available from Mistral AI.