Phi 3.5 mini instruct

microsoft

Introduction

Phi-3.5-Mini-Instruct is a lightweight, state-of-the-art language model developed by Microsoft as part of the Phi-3 model family. It generates text in multiple languages, is optimized for reasoning and instruction following, and supports a context length of up to 128K tokens. Its primary use cases are memory- and compute-constrained environments, latency-bound scenarios, and tasks that require strong reasoning.

Architecture

Phi-3.5-Mini-Instruct is a dense decoder-only Transformer model with 3.8 billion parameters. It uses the same tokenizer as Phi-3 Mini, with a vocabulary size of up to 32,064 tokens. The model is trained on a diverse dataset that combines publicly available documents with synthetic data chosen to strengthen reasoning. Supported languages include Arabic, Chinese, English, French, German, and Spanish, among others.
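
To make the "memory-constrained" use case concrete, the parameter count above can be turned into a rough estimate of how much memory the weights alone occupy. This is a back-of-the-envelope sketch, not a measurement: the function name and the dtype table are illustrative, and the estimate ignores activations, the KV cache, and framework overhead.

```python
# Rough weight-memory estimate for a dense model (weights only).
# Assumption: every parameter is stored at the same width; activations,
# KV cache and framework overhead are NOT included.

PARAMS = 3.8e9  # parameter count stated in the Architecture section

# Standard storage widths in bytes per parameter.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gib(PARAMS, dtype):.1f} GiB")
```

At 16-bit precision this comes to roughly 7 GiB for the weights, which is why the model fits on a single GPU; actual usage at run time will be higher.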

Training

The model was trained using 512 NVIDIA H100 GPUs over a period of 10 days, processing 3.4 trillion tokens. The training data comprises high-quality educational data, synthetic textbook-like data, and filtered public documents. Fine-tuning techniques such as supervised fine-tuning, proximal policy optimization, and direct preference optimization were employed to improve instruction adherence and safety.
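
The figures above imply a total compute budget and an average throughput, which can be worked out directly. The sketch below is only arithmetic on the stated numbers; it assumes all 512 GPUs ran for the full 10 days with no downtime and that the 3.4 trillion tokens were processed evenly over that period, neither of which the source confirms. The function names are illustrative.

```python
# Back-of-the-envelope training arithmetic from the stated figures.
# Assumptions: all GPUs busy for the whole period, tokens spread evenly.

GPUS = 512        # NVIDIA H100 GPUs
DAYS = 10         # wall-clock training time
TOKENS = 3.4e12   # tokens processed


def gpu_hours(gpus: int, days: float) -> float:
    """Total GPU-hours consumed."""
    return gpus * days * 24


def tokens_per_gpu_second(tokens: float, gpus: int, days: float) -> float:
    """Average tokens processed per GPU per second."""
    return tokens / (gpus * days * 24 * 3600)


if __name__ == "__main__":
    print(f"GPU-hours:        {gpu_hours(GPUS, DAYS):,.0f}")
    print(f"tokens/GPU/second: {tokens_per_gpu_second(TOKENS, GPUS, DAYS):,.0f}")
```

Under these assumptions the run amounts to about 123 thousand GPU-hours, or several thousand tokens per GPU per second on average.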

Guide: Running Locally

To run the Phi-3.5-Mini-Instruct model locally, follow these steps:

  1. Install Required Packages:

    pip install torch==2.3.1 transformers==4.43.0 accelerate==0.31.0 flash_attn==2.5.8
    
  2. Load the Model:

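    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    # Download (on first use) and load the weights onto the GPU.
    model = AutoModelForCausalLM.from_pretrained(
        "microsoft/Phi-3.5-mini-instruct",
        device_map="cuda",        # place the whole model on the CUDA device
        torch_dtype="auto",       # use the dtype stored in the checkpoint
        trust_remote_code=True,   # allow the repository's custom model code to run
    )
    tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")

    # Wrap the model and tokenizer in a text-generation pipeline.
    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)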
    
  3. Generate Text:

    messages = [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Provide ways to eat combinations of bananas and dragonfruits."},
    ]
    
    # Greedy decoding: with do_sample=False the temperature setting is ignored,
    # so it is omitted here.
    output = pipe(messages, max_new_tokens=500, return_full_text=False, do_sample=False)
    print(output[0]["generated_text"])
    
  4. GPU Recommendations: For best performance, run the model on a GPU such as an NVIDIA A100, A6000, or H100, for example through a cloud provider.
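
The `messages` list in step 3 is not sent to the model as-is: the pipeline first flattens it into a single prompt string using the tokenizer's chat template. The sketch below imitates that step by hand so the resulting prompt is visible. It assumes the `<|role|>` ... `<|end|>` layout described in the upstream model card, and `render_phi3_prompt` is a hypothetical helper written for illustration, not part of `transformers`. In real code, prefer `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` and treat its output as authoritative.

```python
# Hand-rolled illustration of how a chat message list becomes one prompt string.
# Assumption: the Phi-3 style layout "<|role|>\n{content}<|end|>\n" followed by
# "<|assistant|>\n" to cue the model's reply. Verify against the tokenizer's
# own chat template before relying on it.

from typing import Dict, List


def render_phi3_prompt(messages: List[Dict[str, str]]) -> str:
    """Render system/user/assistant messages into a single prompt string."""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role not in {"system", "user", "assistant"}:
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"<|{role}|>\n{content}<|end|>\n")
    # Generation prompt: the model continues from here as the assistant.
    parts.append("<|assistant|>\n")
    return "".join(parts)


if __name__ == "__main__":
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Provide ways to eat combinations of bananas and dragonfruits."},
    ]
    print(render_phi3_prompt(messages))
```

Seeing the flattened prompt helps when debugging unexpected output, for example when a system message is missing or a turn was not closed.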

License

The Phi-3.5-Mini-Instruct model is released under the MIT License, which permits broad use and modification. Any use of Microsoft trademarks or logos must follow Microsoft's Trademark & Brand Guidelines.
