Mamba-Codestral-7B-v0.1

Introduction
Mamba-Codestral-7B-v0.1 is an open-source code model based on the Mamba2 architecture, designed to compete with state-of-the-art Transformer-based code models. It is developed by Mistral AI and achieves strong results on industry-standard code benchmarks.
Architecture
The model is built on the Mamba2 architecture, which is tailored for code generation and related tasks. This architecture allows the model to handle complex programming queries effectively, providing a robust solution for code-based applications.
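For intuition only: Mamba-style models replace attention with a state-space recurrence, so each token is processed in constant time with a fixed-size hidden state rather than attending over the whole context. The scalar sketch below is illustrative, not the actual Mamba2 implementation, which uses input-dependent (selective) parameters and a hardware-aware parallel scan.

```python
# Schematic of the linear state-space recurrence behind Mamba-style models:
#   h_t = a * h_{t-1} + b * x_t   (state update)
#   y_t = c * h_t                 (output readout)
# The scalar constants here are illustrative placeholders only.

def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Run a scalar state-space recurrence over a sequence."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x  # state carries a decaying summary of the past
        ys.append(c * h)
    return ys

print(ssm_scan([1.0, 0.0, 0.0]))  # state decays: [1.0, 0.5, 0.25]
```

Because the state is updated sequentially with fixed cost per token, inference cost grows linearly with sequence length instead of quadratically as in Transformer attention.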
Evaluation
Mamba-Codestral-7B-v0.1 has been evaluated on several benchmarks, including HumanEval, MBPP, Spider, and others. The model demonstrates impressive performance across various datasets, indicating its capability to generate accurate and efficient code across different programming languages and scenarios.
Guide: Running Locally
-
Installation
Begin by installing the necessary packages. You can use mistral_inference or the original mamba package:

```sh
pip install "mistral_inference>=1" mamba-ssm causal-conv1d
```

or

```sh
pip install mamba-ssm causal-conv1d
```
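To confirm the installation succeeded, you can check that the modules are importable. This is a small hypothetical helper, not part of either package; note that the importable module names use underscores (mamba_ssm, causal_conv1d) even though the pip package names use hyphens.

```python
from importlib.util import find_spec

def missing_modules(names):
    """Return the subset of module names that cannot be found by the importer."""
    return [name for name in names if find_spec(name) is None]

# Pip package names differ from module names: mamba-ssm -> mamba_ssm,
# causal-conv1d -> causal_conv1d.
print(missing_modules(["mistral_inference", "mamba_ssm", "causal_conv1d"]))
```

An empty list means all three modules resolve in the current environment.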
-
Download the Model
Use the Hugging Face hub to download the model:

```python
from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Mamba-Codestral-7B-v0.1')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Mamba-Codestral-7B-v0.1",
                  allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"],
                  local_dir=mistral_models_path)
```
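Once the download finishes, you can sanity-check that the three requested files actually landed in the target directory. This is a small illustrative helper, not part of huggingface_hub:

```python
from pathlib import Path

# The same three files requested via allow_patterns in the download step.
REQUIRED_FILES = ["params.json", "consolidated.safetensors", "tokenizer.model.v3"]

def missing_files(model_dir, required=REQUIRED_FILES):
    """Return the required files that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in required if not (model_dir / name).is_file()]

missing = missing_files(Path.home() / "mistral_models" / "Mamba-Codestral-7B-v0.1")
if missing:
    print("Incomplete download, missing:", missing)
```

An incomplete consolidated.safetensors (roughly 14 GB for a 7B model in bf16) is the most common failure mode on flaky connections; re-running snapshot_download resumes the transfer.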
-
Run the Model
After installation, use the mistral-chat CLI to start a chat session:

```sh
mistral-chat $HOME/mistral_models/Mamba-Codestral-7B-v0.1 --instruct --max_tokens 256
```
-
Cloud GPUs
For optimal performance, consider using cloud-based GPU services such as AWS, Google Cloud, or Azure to handle the computational requirements of the model.
License
Mamba-Codestral-7B-v0.1 is released under the Apache 2.0 license, which permits broad use and modification while requiring attribution to the original creators.