Solar Pro Preview Instruct

Upstage

Introduction

Solar Pro Preview is a large language model (LLM) developed by Upstage. It has 22 billion parameters and is sized to run on a single GPU. It is reported to rival the performance of much larger models with up to 70 billion parameters, such as Llama 3.1 70B. The model is tuned for conversational and instruction-following tasks.
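To make the single-GPU claim concrete, the following back-of-the-envelope sketch estimates the memory needed for the weights alone. It assumes 16-bit weights (2 bytes per parameter), decimal gigabytes, and the 80GB VRAM figure cited elsewhere in this document; the helper name weight_memory_gb is illustrative, and the estimate ignores the KV cache, activations, and framework overhead.

```python
# Rough weight-memory estimate. Assumptions (not from the model card):
# 16-bit weights (2 bytes per parameter) and decimal gigabytes.
GPU_VRAM_GB = 80  # VRAM figure cited in this document


def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Return the memory taken by the weights alone, in GB."""
    return n_params * bytes_per_param / 1e9


for name, n_params in [("Solar Pro Preview (22B)", 22e9), ("70B-class model", 70e9)]:
    need = weight_memory_gb(n_params)
    verdict = "fits" if need < GPU_VRAM_GB else "does not fit"
    print(f"{name}: ~{need:.0f} GB of weights, {verdict} in {GPU_VRAM_GB} GB")
```

Under these assumptions the 22B weights take about 44 GB, leaving headroom on an 80GB card, while a 70B model at the same precision would need about 140 GB for its weights.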

Architecture

Solar Pro Preview uses an enhanced depth up-scaling method to scale the 14-billion-parameter Phi-3-medium model up to 22 billion parameters, a size intended to run on a single GPU with 80GB of VRAM. The preview supports only English and has a maximum context length of 4K tokens. The official release, planned for November 2024, is expected to add support for more languages and longer context windows.
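The "enhanced" procedure used for Solar Pro Preview is not described here. As a rough illustration of the underlying idea only, the sketch below follows the basic depth up-scaling recipe published for the earlier SOLAR 10.7B model: duplicate the base model's layer stack, drop the last m layers from the first copy and the first m layers from the second, and concatenate the two. The function name and the values n_layers=32, m=8 are illustrative (they match SOLAR 10.7B, not the Phi-3-medium expansion).

```python
def depth_up_scale(n_layers: int, m: int) -> list[int]:
    """Return the base-model layer indices that make up the up-scaled stack.

    Sketch of basic depth up-scaling; the enhanced variant used for
    Solar Pro Preview may differ.
    """
    if not 0 <= m < n_layers:
        raise ValueError("m must satisfy 0 <= m < n_layers")
    first_copy = list(range(0, n_layers - m))  # copy A without its last m layers
    second_copy = list(range(m, n_layers))     # copy B without its first m layers
    return first_copy + second_copy


# Illustrative values only: a 32-layer base with m = 8.
layers = depth_up_scale(n_layers=32, m=8)
print(len(layers))
```

With these illustrative values the result has 2 * (32 - 8) = 48 layers. In practice the up-scaled model is then trained further so that the seam between the two copies is smoothed out.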

Training

The model was trained on a curated dataset, a strategy that improves its results on benchmarks such as MMLU-Pro and IFEval. MMLU-Pro primarily assesses a model's knowledge, and IFEval assesses its instruction-following ability.

Guide: Running Locally

  1. Install Requirements:

    • Install the following packages at these pinned versions: transformers==4.44.2, torch==2.3.1, flash_attn==2.5.8, accelerate==0.31.0.
  2. Load the Model:

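    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the tokenizer and model weights from the Hugging Face Hub (cached after the first run).
    tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-instruct")
    model = AutoModelForCausalLM.from_pretrained(
        "upstage/solar-pro-preview-instruct",
        device_map="cuda",        # place the whole model on the GPU
        torch_dtype="auto",       # use the dtype stored in the checkpoint
        trust_remote_code=True,   # allow custom model code from the model repository to run
    )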
    
  3. Generate Text:

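    # Build a chat-formatted prompt; add_generation_prompt=True appends the marker for the assistant's turn.
    messages = [{"role": "user", "content": "Please, introduce yourself."}]
    prompt = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
    # Generate up to 512 new tokens, then decode the full sequence (prompt plus reply).
    outputs = model.generate(prompt, max_new_tokens=512)
    print(tokenizer.decode(outputs[0]))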
    
  • Suggested cloud GPUs: To run Solar Pro Preview efficiently, consider a cloud instance with an NVIDIA GPU that has at least 80GB of VRAM.
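Before loading the model, it may help to confirm that the environment matches the versions pinned in step 1. The sketch below uses the standard-library importlib.metadata module; the helper name check_versions and the PINNED dictionary are illustrative, and the lookup assumes the packages are registered under the same names as in the requirements list (on older Python versions the flash_attn distribution may only be found as flash-attn).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in step 1 of the guide.
PINNED = {
    "transformers": "4.44.2",
    "torch": "2.3.1",
    "flash_attn": "2.5.8",
    "accelerate": "0.31.0",
}


def check_versions(pinned, get_version=version):
    """Return {package: found_version_or_None} for every package that does not match."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None  # package is not installed
        # Ignore local build suffixes such as "2.3.1+cu121" reported by some torch wheels.
        if found is None or found.split("+")[0] != wanted:
            problems[name] = found
    return problems


mismatches = check_versions(PINNED)
if mismatches:
    for name, found in mismatches.items():
        print(f"{name}: expected {PINNED[name]}, found {found or 'not installed'}")
else:
    print("All pinned packages match.")
```

The get_version parameter exists only so the check can be exercised without touching the real environment; in normal use the default is enough.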

License

Solar Pro Preview is released under the MIT License. For more details, refer to the license link.
