Mistral-7B-Instruct-v0.1
Introduction
Mistral-7B-Instruct-v0.1 is a large language model (LLM) developed as an instruction-fine-tuned variant of the base Mistral-7B-v0.1 model. It is optimized for text generation tasks and fine-tuned on a variety of publicly available conversation datasets. Detailed information is available in the research paper and release blog post.
Architecture
The model is based on the Mistral-7B-v0.1 architecture, featuring:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
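These choices are reflected in the model's configuration file. Below is a minimal sketch for inspecting them; the attribute names follow the Transformers `MistralConfig`, and the commented values are what the Mistral-7B config is expected to contain, not guaranteed:

```python
from transformers import AutoConfig

# Fetches only config.json, not the multi-gigabyte weights.
config = AutoConfig.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

print(config.num_attention_heads)  # query heads, e.g. 32
print(config.num_key_value_heads)  # shared key/value heads for grouped-query attention, e.g. 8
print(config.sliding_window)       # sliding-window attention span in tokens, e.g. 4096
```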
Training
Mistral-7B-Instruct-v0.1 is fine-tuned on diverse conversation datasets, enabling it to perform well on text generation tasks. The instruction format wraps each prompt in [INST] and [/INST] tokens; the first instruction begins with a beginning-of-sentence token, and each assistant reply ends with an end-of-sentence token, which is how the model tracks conversation turns.
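As a rough sketch of what this format produces (the exchange below is illustrative), the exact template ships with the tokenizer, so in practice it is safest to let `apply_chat_template` construct the prompt:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")

# An illustrative multi-turn conversation.
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "A good squeeze of fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]

# Renders the [INST]/[/INST] format stored with the tokenizer, roughly:
# "<s>[INST] ... [/INST] ... </s>[INST] ... [/INST]"
prompt = tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)
```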
Guide: Running Locally
- Install Prerequisites: Ensure you have Python and the necessary libraries installed. Install the latest version of the Hugging Face Transformers library:

```bash
pip install transformers
```

- Load the Model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
model.to("cuda")  # Load model to GPU
```
- Run Inference: Prepare and encode your messages with the tokenizer, then generate a response with the model:
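Continuing from the snippet above, here is a minimal sketch; the question and generation parameters are illustrative, not prescribed:

```python
# Build the prompt with the tokenizer's chat template, move it to the GPU,
# and generate a response.
messages = [{"role": "user", "content": "Explain sliding-window attention in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to("cuda")

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```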
- Cloud GPUs: For better performance, consider using cloud GPU services like AWS, GCP, or Azure.
License
The Mistral-7B-Instruct-v0.1 model is released under the Apache-2.0 license, which allows users to freely use, modify, and distribute it in accordance with the license terms.