Mixtral 8x22B Instruct v0.1
mistralai/Mixtral-8x22B-Instruct-v0.1 Model Documentation
Introduction
Mixtral-8x22B-Instruct-v0.1 is a large language model (LLM) developed by Mistral AI. It is an instruct fine-tuned version of the Mixtral-8x22B-v0.1 model, designed for enhanced performance in text generation tasks across multiple languages, including English, Spanish, Italian, German, and French.
Architecture
The model is a sparse mixture-of-experts (MoE) transformer, distributed in the Safetensors format and usable through the Hugging Face transformers library. It is suited to text generation and conversational applications, and inference endpoints are available for deployment. Encoding and decoding of chat requests follow Mistral's own tokenizer protocols, implemented in the mistral-common package.
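As a brief, non-authoritative illustration of those tokenizer protocols, the sketch below encodes and decodes a chat request with the mistral-common package (installed in the guide further down); the class and method names follow mistral-common's documented v3 API and should be checked against the version you install.

```python
# Sketch: encoding/decoding a chat request with Mistral's own tokenizer
# (mistral-common). Class/method names follow the package's v3 API.
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

tokenizer = MistralTokenizer.v3()  # v3 tokenizer, as used by Mixtral-8x22B-Instruct-v0.1
encoded = tokenizer.encode_chat_completion(
    ChatCompletionRequest(messages=[UserMessage(content="Hello, how are you?")])
)
print(encoded.text)                      # the fully formatted prompt string
print(tokenizer.decode(encoded.tokens))  # round-trip the token ids back to text
```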
Training
Mixtral-8x22B-Instruct-v0.1 has been instruct fine-tuned from Mixtral-8x22B-v0.1 to follow complex instructions and generate text effectively. The fine-tuning also covers tool use and function calling, enabling the model to perform tasks such as weather retrieval through structured function calls.
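As an illustration of that function-calling workflow, the sketch below builds a chat request that declares a hypothetical get_current_weather tool using mistral-common's Tool and Function classes; the tool schema is illustrative only, and the exact format of the model's tool-call reply should be verified against Mistral's documentation.

```python
# Sketch: declaring a tool so the model can answer with a structured function
# call. The get_current_weather schema is hypothetical/illustrative.
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.protocol.instruct.tool_calls import Function, Tool
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer

tokenizer = MistralTokenizer.v3()
request = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use.",
                        },
                    },
                    "required": ["location", "format"],
                },
            )
        )
    ],
    messages=[UserMessage(content="What's the weather like today in Paris?")],
)
# The encoded prompt embeds the tool schema; the model is then expected to
# respond with a structured call to get_current_weather rather than free text.
tokenized = tokenizer.encode_chat_completion(request)
print(tokenized.text)
```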
Guide: Running Locally
- Set up the environment: install the required packages using pip.

  ```bash
  pip install transformers torch mistral-common
  ```
- Load the model: use Hugging Face's transformers library.

  ```python
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "mistralai/Mixtral-8x22B-Instruct-v0.1"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  # device_map="auto" already places the weights on the available GPUs,
  # so no additional model.to("cuda") call is needed.
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      torch_dtype=torch.bfloat16,
      device_map="auto",
  )
  ```
- Prepare inputs: use the tokenizer to format and tokenize your inputs, then move them to the model's device.

  ```python
  chat = [{"role": "user", "content": "Explain Machine Learning to me in a nutshell."}]
  tokens = tokenizer.apply_chat_template(
      chat,
      return_dict=True,
      return_tensors="pt",
      add_generation_prompt=True,
  ).to(model.device)
  ```
- Generate text: run inference to generate responses (a sketch for decoding only the newly generated tokens follows this list).

  ```python
  generated_ids = model.generate(**tokens, max_new_tokens=1000, do_sample=True)
  result = tokenizer.decode(generated_ids[0])
  print(result)
  ```
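The decoded result above includes the prompt itself. As an optional follow-up sketch (reusing the variable names from the steps above), you can slice off the prompt tokens and decode only the model's reply:

```python
# Optional: decode only the newly generated tokens, skipping the echoed prompt.
prompt_length = tokens["input_ids"].shape[-1]
new_tokens = generated_ids[0][prompt_length:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```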
Cloud GPUs
For optimal performance with a model of this size (approximately 141B total parameters, requiring several hundred gigabytes of GPU memory in bfloat16), consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure. This ensures sufficient computational resources and reduces latency during inference.
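If that much GPU memory is not available, one common workaround (a sketch under stated assumptions, not part of the official instructions) is to load the weights in 4-bit precision via transformers' BitsAndBytesConfig, trading some quality and speed for a much smaller memory footprint; this requires the bitsandbytes package.

```python
# Sketch: loading the model with 4-bit quantization to reduce GPU memory use.
# Assumes the bitsandbytes package is installed; this is an illustrative
# workaround rather than part of the official setup instructions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mixtral-8x22B-Instruct-v0.1"
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
```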
License
Mixtral-8x22B-Instruct-v0.1 is licensed under the Apache-2.0 License, allowing for both personal and commercial use with minimal restrictions. For more details, refer to the license documentation.