Ministral-8B-Instruct-2410

mistralai

Introduction

Ministral-8B-Instruct-2410 is a language model developed by Mistral AI for on-device computing and edge use cases. It outperforms other models of similar size and is released under the Mistral Research License. The model is trained on multilingual and code data, with a focus on multilingual capabilities and function calling.

Architecture

The Ministral-8B features a dense transformer architecture with the following specifications:

  • Parameters: 8,019,808,256
  • Layers: 36
  • Heads: 32
  • Hidden Dimension: 12,288
  • Vocabulary Size: 131,072
  • Context Length: 128k
  • Attention Pattern: Ragged (128k,32k,32k,32k)
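If you want to check these numbers yourself, a minimal sketch (assuming the repository exposes a standard transformers-format config and that you have accepted the model's license on Hugging Face; the attribute names below are standard transformers MistralConfig names, not taken from the card):

    from transformers import AutoConfig

    # Load the model configuration; the repo is gated, so a Hugging Face login
    # with license acceptance may be required.
    config = AutoConfig.from_pretrained("mistralai/Ministral-8B-Instruct-2410")

    print(config.num_hidden_layers)    # expected: 36 layers
    print(config.num_attention_heads)  # expected: 32 attention heads
    print(config.intermediate_size)    # expected: 12288 (feed-forward hidden dimension)
    print(config.vocab_size)           # expected: 131072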

Training

Ministral-8B is trained with a 128k context window using interleaved sliding-window attention. The model was fine-tuned with a variety of multilingual and code datasets to enhance its instruction-following capabilities.
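To illustrate the sliding-window mechanism, here is a toy sketch of a sliding-window causal attention mask. The function name and the small sequence length and window size are illustrative only; the actual model interleaves a full 128k window with 32k windows, as listed in the attention pattern above.

    import torch

    def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
        """True where query position i is allowed to attend to key position j."""
        i = torch.arange(seq_len).unsqueeze(1)  # query positions (rows)
        j = torch.arange(seq_len).unsqueeze(0)  # key positions (columns)
        causal = j <= i                # no attention to future tokens
        in_window = (i - j) < window   # only the most recent `window` tokens
        return causal & in_window

    # Tiny example: 8 tokens, window of 4.
    print(sliding_window_causal_mask(seq_len=8, window=4).int())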

Guide: Running Locally

Basic Steps

  1. Install vLLM and Mistral Common:

    pip install --upgrade vllm
    pip install --upgrade mistral_common
    
  2. Set Up Model:

    • Use the vLLM library to load the model.
    • Ensure you have at least 24 GB of GPU RAM. For multi-device setups, pass tensor_parallel_size=2 to vLLM.
  3. Run Inference:

    • Use the vLLM server-client setup for chat completions, or run offline inference directly from Python (see the sketches after this list).
    • Alternatively, leverage mistral-inference for a quick setup.
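For steps 2 and 3, a minimal offline-inference sketch with vLLM. The prompt and max_tokens value are placeholders, and the Mistral-format loading options follow vLLM's documented usage for Mistral models; they may need adjusting for your installed version.

    from vllm import LLM
    from vllm.sampling_params import SamplingParams

    model_name = "mistralai/Ministral-8B-Instruct-2410"
    sampling_params = SamplingParams(max_tokens=256)

    # Load the model with Mistral-native tokenizer, config, and weight formats.
    llm = LLM(
        model=model_name,
        tokenizer_mode="mistral",
        config_format="mistral",
        load_format="mistral",
    )

    messages = [
        {"role": "user", "content": "Explain edge computing in two sentences."},
    ]
    outputs = llm.chat(messages, sampling_params=sampling_params)
    print(outputs[0].outputs[0].text)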
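For the server-client route, a sketch that queries a locally running vLLM OpenAI-compatible server with the openai client. The port, empty API key, and the serve flags shown in the comment are vLLM defaults and assumptions about your setup.

    # Assumes a vLLM server is already running, e.g. started in a shell with:
    #   vllm serve mistralai/Ministral-8B-Instruct-2410 \
    #     --tokenizer_mode mistral --config_format mistral --load_format mistral
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    response = client.chat.completions.create(
        model="mistralai/Ministral-8B-Instruct-2410",
        messages=[{"role": "user", "content": "Give one use case for on-device LLMs."}],
        max_tokens=128,
    )
    print(response.choices[0].message.content)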

Cloud GPU Suggestions

Cloud services such as AWS or Google Cloud offer GPUs (for example, an NVIDIA A100 or L40S) that comfortably meet the model's requirement of at least 24 GB of GPU RAM.

License

Ministral-8B-Instruct-2410 is distributed under the Mistral Research License. Usage is primarily for research purposes, and commercial use requires a separate license agreement with Mistral AI. The license outlines terms for distribution, modification, and usage limitations, emphasizing non-commercial and research-oriented applications. Full license details are available from Mistral AI.
