Mistral 7B Instruct v0.2

mistralai

Introduction

Mistral-7B-Instruct-v0.2 is a large language model (LLM): an instruction fine-tuned version of Mistral-7B-v0.2. It is part of the Mistral AI model suite and targets text generation tasks.

Architecture

Mistral-7B-Instruct-v0.2 features significant improvements over its predecessor, Mistral-7B-v0.1, including:

  • A 32k-token context window, expanded from 8k in v0.1.
  • A rope-theta value of 1e6.
  • No sliding-window attention (which v0.1 used).

Training

The model is fine-tuned for instruction-based tasks, requiring prompts to be wrapped in [INST] and [/INST] tokens. The first instruction must begin with a begin-of-sentence (BOS) token, while subsequent instructions must not. The model generates a response until it emits an end-of-sentence (EOS) token.
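As a sketch of that template (the exact string layout below is an assumption based on the token description above; in practice the tokenizer's chat template builds this string and adds the BOS token for you):

```python
# Hand-rolled sketch of the [INST] ... [/INST] instruction template.
# The BOS token (<s>) is written explicitly here; the tokenizer normally adds it.
def build_prompt(turns):
    """turns: list of (user_message, assistant_reply_or_None) pairs."""
    prompt = "<s>"  # begin-of-sentence token, only before the first instruction
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}</s>"  # each completed reply ends with EOS
    return prompt

prompt = build_prompt([
    ("What is your favourite condiment?",
     "Well, I'm quite partial to a good squeeze of fresh lemon juice."),
    ("Do you have mayonnaise recipes?", None),
])
```

Note that only the first instruction carries the BOS token, matching the rule above, and each finished assistant turn is closed with the EOS token.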

Guide: Running Locally

Basic Steps

  1. Install Dependencies: Ensure you have PyTorch and the Hugging Face Transformers library installed.
  2. Load the Model:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-Instruct-v0.2"
    model = AutoModelForCausalLM.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    
  3. Run Inference:
    • Encode your input message using the tokenizer.
    • Use the model to generate text based on your input.
    • Decode the output tokens to obtain the final text.
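Putting the three steps together, a minimal sketch (the model name is from this card; generation settings such as `max_new_tokens` and sampling are illustrative choices, not requirements):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def chat(model, tokenizer, messages, max_new_tokens=256):
    """Run the encode -> generate -> decode loop for a list of chat messages."""
    # 1. Encode: apply_chat_template wraps the turns in [INST] ... [/INST] tokens.
    input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
    # 2. Generate: decoding stops at the end-of-sentence token or the length cap.
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens, do_sample=True)
    # 3. Decode: drop the prompt tokens and return only the newly generated text.
    return tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)

# Example usage (downloads roughly 15 GB of weights; a GPU is strongly recommended):
# tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
# model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2", device_map="auto")
# print(chat(model, tokenizer, [{"role": "user", "content": "Hello there!"}]))
```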

Cloud GPUs

For practical inference speeds, consider a cloud-based GPU service such as AWS EC2, Google Cloud Platform, or Azure. In float16, the 7B parameters alone occupy roughly 15 GB, so a GPU with 24 GB or more of memory is a comfortable fit.

License

The Mistral-7B-Instruct-v0.2 model is released under the Apache 2.0 license, allowing for both personal and commercial use with minimal restrictions.
