Phi-3-Mini-4K-Instruct
microsoft/Phi-3-Mini-4K-Instruct Model Summary
Introduction
Phi-3-Mini-4K-Instruct is a lightweight, state-of-the-art open model with 3.8 billion parameters. It is part of the Phi-3 family, which is offered in two context-length variants, 4K and 128K tokens. The model is designed for text generation and has undergone supervised fine-tuning and direct preference optimization to strengthen instruction following and safety.
Architecture
- Parameters: 3.8 billion
- Type: Dense, decoder-only Transformer
- Context Length: Supports up to 4K tokens
- Training GPUs: 512 H100-80G
- Training Duration: 10 days
Training
- Data: 4.9 trillion tokens from a mix of publicly available documents and synthetic data.
- Fine-tuning: Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO); the chat format this tuning targets is sketched after this list.
- Performance: Demonstrates robust reasoning on benchmarks covering common sense, language understanding, math, and code, performing strongly against models with fewer than 13 billion parameters.
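Because the instruction tuning targets Phi-3's chat format, prompts at inference time are structured with special tokens roughly as below (a sketch based on the model card; in practice the tokenizer's chat template produces this automatically):

```
<|system|>
You are a helpful AI assistant.<|end|>
<|user|>
How do tides work?<|end|>
<|assistant|>
```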
Guide: Running Locally
- Environment Setup: Ensure you have the following package versions installed (an install sketch follows this list):

```
flash_attn==2.5.8
torch==2.3.1
accelerate==0.31.0
transformers==4.41.2
```
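A quick install sketch, assuming a CUDA-capable machine (flash_attn compiles against PyTorch, so installing torch first avoids build failures):

```
pip install torch==2.3.1
pip install flash_attn==2.5.8 accelerate==0.31.0 transformers==4.41.2
```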
- Sample Code:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the model onto the GPU; trust_remote_code pulls in Phi-3's custom code.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# A chat is a list of role/content messages; the user prompt here is illustrative.
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Explain the difference between a list and a tuple in Python."},
]

# temperature=0.0 with do_sample=False gives deterministic, greedy decoding.
output = pipe(
    messages,
    max_new_tokens=500,
    return_full_text=False,
    temperature=0.0,
    do_sample=False,
)
print(output[0]["generated_text"])
```
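If you prefer calling `generate` directly rather than going through the pipeline, the tokenizer's chat template builds the prompt format shown under Training. A minimal sketch, with the same illustrative-prompt caveat:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="cuda", torch_dtype="auto", trust_remote_code=True
)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Summarize what a decoder-only Transformer is."},  # illustrative
]

# apply_chat_template inserts the <|system|>/<|user|>/<|assistant|> tokens for us.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    out = model.generate(inputs, max_new_tokens=500, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```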
- Suggested Cloud GPUs: NVIDIA A100, A6000, or H100 for optimal performance. Older GPUs such as the NVIDIA V100 lack flash-attention support, so pass `attn_implementation="eager"` when loading the model (see the sketch below).
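For example, the load call on a V100 might look like the following; everything except the `attn_implementation` flag mirrors the sample code above:

```python
from transformers import AutoModelForCausalLM

# Fall back to the standard attention kernel on GPUs without flash-attention support.
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
    attn_implementation="eager",
)
```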
License
The Phi-3-Mini-4K-Instruct model is released under the MIT License. Use of trademarks and logos is subject to Microsoft's guidelines and third-party policies where applicable.