Phi-3-Medium-128K-Instruct
by microsoft

Introduction
The Phi-3-Medium-128K-Instruct is a state-of-the-art open model with 14 billion parameters. It is part of the Phi-3 family, designed for strong performance in text generation, particularly on reasoning and understanding tasks. The model supports multilingual input and is optimized for general-purpose AI applications as well as memory- and compute-constrained environments.
Architecture
Phi-3-Medium-128k-Instruct is a dense decoder-only Transformer model. It features a context length of 128k tokens and has been fine-tuned with supervised fine-tuning (SFT) and direct preference optimization (DPO) for alignment with human preferences and safety measures.
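These details can be checked without downloading any weights by inspecting the model's configuration. A minimal sketch using AutoConfig, the standard transformers entry point; the attribute names below are common transformers config fields, and the commented values are expectations rather than guarantees:

```python
from transformers import AutoConfig

# Fetch only the configuration file, not the model weights.
config = AutoConfig.from_pretrained(
    "microsoft/Phi-3-medium-128k-instruct",
    trust_remote_code=True,
)

print(config.model_type)               # the decoder-only Phi-3 architecture
print(config.max_position_embeddings)  # context window; 128k tokens = 131072
print(config.num_hidden_layers, config.hidden_size)
```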
Training
The training process for Phi-3-Medium-128k-Instruct involved 512 H100-80G GPUs over 42 days, using 4.8 trillion tokens. The dataset includes high-quality educational data, synthetic data, and filtered public documents to enhance reasoning capabilities. The model was trained from February to April 2024 and released in May 2024.
Guide: Running Locally
- Set Up Environment: install the development version of transformers:

  ```bash
  pip uninstall -y transformers
  pip install git+https://github.com/huggingface/transformers
  ```
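  A quick way to confirm the development install took effect (development builds carry a ".dev0" suffix in the version string):

  ```python
  import transformers

  print(transformers.__version__)  # e.g. a version ending in ".dev0"
  ```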
- Load the Model:

  ```python
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

  model_id = "microsoft/Phi-3-medium-128k-instruct"

  # Load the weights onto the GPU; torch_dtype="auto" keeps the checkpoint's dtype.
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      device_map="cuda",
      torch_dtype="auto",
      trust_remote_code=True,
  )
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  ```
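  Under the hood, chat-style input relies on the tokenizer's chat template, which renders message dicts into the model's raw prompt format. A minimal sketch of calling it directly (apply_chat_template is a standard transformers tokenizer method):

  ```python
  # Render chat messages into the prompt string the model actually consumes.
  messages = [{"role": "user", "content": "Example question"}]
  prompt = tokenizer.apply_chat_template(
      messages,
      tokenize=False,               # return a string rather than token ids
      add_generation_prompt=True,   # append the assistant-turn marker
  )
  print(prompt)
  ```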
- Run Inference:

  ```python
  # Build a text-generation pipeline from the loaded model and tokenizer.
  pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

  messages = [{"role": "user", "content": "Example question"}]

  # do_sample=False selects greedy decoding for deterministic output.
  output = pipe(messages, max_new_tokens=500, do_sample=False)
  print(output[0]["generated_text"])
  ```
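  If sampled, non-deterministic output is preferred, generation parameters can be passed through the same pipeline call. A sketch; the temperature and top_p values below are illustrative, not tuned recommendations:

  ```python
  # Sampled generation; return_full_text=False returns only the new completion.
  output = pipe(
      messages,
      max_new_tokens=500,
      do_sample=True,
      temperature=0.7,
      top_p=0.9,
      return_full_text=False,
  )
  print(output[0]["generated_text"])
  ```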
- Hardware Recommendations:
  - Use GPUs such as the NVIDIA A100, A6000, or H100 for optimal performance. For scalable resources, consider cloud-based GPUs from providers like AWS, Google Cloud, or Azure.
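For GPUs with less memory than these, 4-bit quantization can shrink the 14B model's footprint to roughly a quarter of its half-precision size. A minimal sketch using the standard transformers BitsAndBytesConfig interface; it assumes the bitsandbytes package is installed, and output quality may degrade slightly versus full precision:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization trades some accuracy for a much smaller memory footprint.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-medium-128k-instruct",
    device_map="auto",
    quantization_config=bnb_config,
    trust_remote_code=True,
)
```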
License
The Phi-3-Medium-128K-Instruct model is licensed under the MIT License. This allows for broad reuse with minimal restrictions, provided the license text is included in all copies or substantial portions of the software.