Codestral-22B-v0.1

mistralai

Introduction

Codestral-22B-v0.1 is a code-focused text-generation model by Mistral AI. It is trained on more than 80 programming languages, including popular ones such as Python, Java, and C++. The model can both answer instruct-style questions about code and perform Fill-in-the-Middle (FIM) prediction, making it suitable for tasks such as code completion, documentation, and general development assistance.

Architecture

Codestral-22B-v0.1 relies on Mistral's own tokenization and inference stack: the MistralTokenizer (v3) handles encoding and decoding, and inference can run either through the mistral_inference package or through Hugging Face's Transformers library for broader compatibility.
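
As a quick illustration of the tokenizer side, the sketch below encodes an instruct prompt with the v3 MistralTokenizer and decodes it back to text. It is a minimal sketch assuming the mistral_common package, which is installed alongside mistral_inference:

    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    
    tokenizer = MistralTokenizer.v3()
    
    # Encode a chat-style request into token ids, then decode them back to text
    encoded = tokenizer.encode_chat_completion(
        ChatCompletionRequest(messages=[UserMessage(content="Write a hello world in Python.")])
    )
    print(encoded.tokens[:10])               # first few token ids
    print(tokenizer.decode(encoded.tokens))  # round trip back to text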

Training

The model is trained on a large dataset encompassing over 80 programming languages. It is designed to handle diverse tasks like answering questions about code snippets, generating code, and predicting code segments in FIM mode. More details on its training can be found in Mistral AI's blog post.

Guide: Running Locally

To run Codestral-22B-v0.1 locally, follow these steps:

  1. Installation:
    Install the necessary package using:

    pip install mistral_inference
    
  2. Download Model Files:
    Use the following script to download the required files:

    from huggingface_hub import snapshot_download
    from pathlib import Path
    
    mistral_models_path = Path.home().joinpath('mistral_models', 'Codestral-22B-v0.1')
    mistral_models_path.mkdir(parents=True, exist_ok=True)
    
    snapshot_download(
        repo_id="mistralai/Codestral-22B-v0.1",
        allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"],
        local_dir=mistral_models_path,
    )
    
  3. Inference with Mistral:
    Load the tokenizer and the model, encode an instruct prompt, and generate a completion:

    from mistral_inference.transformer import Transformer
    from mistral_inference.generate import generate
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_common.protocol.instruct.messages import UserMessage
    from mistral_common.protocol.instruct.request import ChatCompletionRequest
    
    # Load the v3 tokenizer and the model weights downloaded in step 2
    tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
    model = Transformer.from_folder(mistral_models_path)
    
    completion_request = ChatCompletionRequest(messages=[UserMessage(content="Write a function that computes fibonacci in Rust.")])
    tokens = tokenizer.encode_chat_completion(completion_request).tokens
    
    out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
    result = tokenizer.decode(out_tokens[0])
    print(result)
    
  4. Inference with Transformers:
    Ensure the Transformers library is installed:

    pip install -U transformers
    

    Use the following code:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_id = "mistralai/Codestral-22B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.to("cuda")
    
    text = "Hello my name is"
    inputs = tokenizer(text, return_tensors="pt").to("cuda")  # keep inputs on the same device as the model
    
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
  5. Suggested Environment:
    Codestral-22B-v0.1 has 22 billion parameters, so its weights alone occupy roughly 44 GB in 16-bit precision. For acceptable throughput, use a cloud GPU (or several) with enough memory, or load the model in half precision as sketched after this guide.
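
Beyond instruct prompts, the Fill-in-the-Middle (FIM) mode mentioned in the introduction lets the model complete code between a given prefix and suffix. The snippet below is a minimal sketch built on the same mistral_inference stack; it assumes the files downloaded in step 2 (mistral_models_path) and the FIMRequest / encode_fim helpers from the mistral_common package:

    from mistral_inference.transformer import Transformer
    from mistral_inference.generate import generate
    from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
    from mistral_common.protocol.fim.request import FIMRequest
    
    tokenizer = MistralTokenizer.v3()
    model = Transformer.from_folder(mistral_models_path)  # folder from step 2
    
    # The model fills in the code between the prefix and the suffix
    prefix = "def add("
    suffix = "    return sum"
    
    request = FIMRequest(prompt=prefix, suffix=suffix)
    tokens = tokenizer.encode_fim(request).tokens
    
    out_tokens, _ = generate([tokens], model, max_tokens=128, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
    result = tokenizer.decode(out_tokens[0])
    
    # Keep only the generated middle part, in case the suffix is echoed back
    middle = result.split(suffix)[0].strip()
    print(middle)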

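If a single GPU cannot hold the full-precision weights, the Transformers example from step 4 can instead load them in bfloat16 and let Accelerate place layers automatically. This is a rough sketch using the standard torch_dtype and device_map arguments of from_pretrained (device_map="auto" assumes the accelerate package is installed):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_id = "mistralai/Codestral-22B-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    
    # bfloat16 halves the memory footprint (~44 GB of weights instead of ~88 GB in float32)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    
    inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
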
License

Codestral-22B-v0.1 is released under the MNPL-0.1 license (Mistral AI Non-Production License). More details can be found on the license page.
