orca_mini_v9_2_70b
pankajmathur

Introduction
Orca_Mini_V9_2_Llama-3.3-70B-Instruct is a text-generation model trained on various SFT datasets on top of the Llama-3.3-70B-Instruct base model. It is designed as a flexible, general-purpose model that users can fine-tune and enhance for specific needs.
Architecture
The model is based on the Llama-3.3-70B-Instruct architecture. It utilizes the Transformers library and supports deployment using PyTorch and Safetensors. The model is trained with English language data and is open to further customization and enhancement by users.
Training
The model is trained with a combination of human-generated and synthetic data, employing a multifaceted approach to enhance safety and quality. The training strategy emphasizes safety fine-tuning to provide a robust, safe, and powerful model for various applications, reducing the workload for developers deploying safe AI systems.
Guide: Running Locally
- Install Dependencies: Ensure you have PyTorch and the Transformers library installed in your environment.
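As a sketch, assuming a pip-based environment, the dependencies can be installed with the following one-liner (accelerate is included because the `device_map="auto"` option used below relies on it):

```shell
pip install torch transformers accelerate
```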
- Set Up the Model:

```python
import torch
from transformers import pipeline

model_slug = "pankajmathur/orca_mini_v9_2_70b"
pipeline = pipeline(
    "text-generation",
    model=model_slug,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```
- Run Inference: Use the model at its default precision (bfloat16), or quantize it to 4-bit or 8-bit using the BitsAndBytesConfig class from the bitsandbytes library for efficiency.
- Example Code:

```python
messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
]
outputs = pipeline(
    messages,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.01,
    top_k=100,
    top_p=0.95,
)
print(outputs[0]["generated_text"][-1])
```
- Recommended Cloud GPUs: For optimal performance, use cloud GPUs such as those from AWS, Google Cloud, or Azure.
License
The model is released under the Llama 3.3 license, which allows users to use it as a foundational base for further fine-tuning and customization. Users must provide proper credit and attribution when using the model.