Solar Pro Preview Instruct
Introduction
Solar Pro Preview is an advanced large language model (LLM) developed by Upstage, featuring 22 billion parameters optimized for deployment on a single GPU. Despite its size, its performance rivals that of models with up to 70 billion parameters, such as Llama 3.1 70B. The model is designed to handle conversational and instruction-following tasks efficiently.
Architecture
Solar Pro Preview uses an enhanced depth up-scaling method to expand the 14-billion-parameter Phi-3-medium model to 22 billion parameters, a size suitable for GPUs with 80GB of VRAM. While it currently supports only English and has a maximum context length of 4K tokens, the official release planned for November 2024 will include extended language support and longer context windows.
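In broad strokes, depth up-scaling grows a model by duplicating a contiguous span of its transformer layers and then continuing pretraining on the deeper stack. The toy sketch below illustrates only the structural duplication step; the function name and the choice of which layers to copy are illustrative assumptions, not Upstage's actual recipe:

```python
def depth_up_scale(layers, start, end):
    """Toy depth up-scaling: splice a copy of layers[start:end] back into
    the stack, making it deeper. Real recipes also continue pretraining
    afterwards; this only shows the structural step."""
    return layers[:end] + layers[start:end] + layers[end:]

# A 4-layer toy stack grows to 6 layers by duplicating the middle two.
stack = ["L0", "L1", "L2", "L3"]
print(depth_up_scale(stack, 1, 3))  # ['L0', 'L1', 'L2', 'L1', 'L2', 'L3']
```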
Training
The model's training strategy involves a curated dataset that enhances its performance on benchmarks like MMLU-Pro and IFEval. These benchmarks are crucial for assessing the model's knowledge and instruction-following capabilities.
Guide: Running Locally
- Install Requirements: Ensure you have the necessary packages: `transformers==4.44.2`, `torch==2.3.1`, `flash_attn==2.5.8`, `accelerate==0.31.0`.
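The pinned packages can be installed in one step. This assumes a Python 3 environment with pip available; note that `flash_attn` compiles against CUDA, so a machine with the CUDA toolchain is assumed:

```shell
# flash_attn builds a CUDA extension; a GPU machine with nvcc is assumed.
pip install transformers==4.44.2 torch==2.3.1 flash_attn==2.5.8 accelerate==0.31.0
```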
- Load the Model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/solar-pro-preview-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
```
- Generate Text:

```python
messages = [{"role": "user", "content": "Please, introduce yourself."}]
prompt = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(prompt, max_new_tokens=512)
print(tokenizer.decode(outputs[0]))
```
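By default, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` echoes the prompt back. A small helper (hypothetical, written here for illustration) slices the prompt off before decoding; pass it the `prompt` and `outputs` variables from the step above:

```python
def new_tokens_only(prompt_ids, output_ids):
    """Return only the tokens generated after the prompt.

    prompt_ids: the tokenized prompt, shape (1, prompt_len)
    output_ids: the model.generate(...) result, shape (1, prompt_len + new)
    """
    return output_ids[0][len(prompt_ids[0]):]

# With the variables from the guide above, this decodes just the reply:
# reply = tokenizer.decode(new_tokens_only(prompt, outputs), skip_special_tokens=True)
```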
- Suggested Cloud GPUs: To efficiently run Solar Pro Preview, consider using cloud services offering NVIDIA GPUs with at least 80GB VRAM.
License
Solar Pro Preview is released under the MIT License. For more details, refer to the license link.