orca_mini_v9_5_1B-Instruct_preview
by pankajmathur

Introduction
Orca_Mini_v9_5_Llama-3.2-1B-Instruct_preview is a model built on the Llama-3.2-1B architecture, designed to be a general-purpose AI assistant. It is trained using various datasets and supports fine-tuning and customization for specific applications.
Architecture
The model is based on the meta-llama/Llama-3.2-1B architecture and uses the transformers library. It supports several quantization formats for efficient deployment, including half precision (bfloat16), 4-bit, and 8-bit.
Training
The model was trained on a combination of human-generated and synthetic data selected for safety and effectiveness. It incorporates safety behaviors such as refusals and tone adjustments, with the goal of producing a safe and robust assistant.
Guide: Running Locally
- Installation: Ensure you have Python and transformers installed. You may also need bitsandbytes for quantization support.
- Select Model: Use the model slug pankajmathur/orca_mini_v9_5_1B-Instruct_preview in your code.
- Code Example:
```python
import torch
from transformers import pipeline

model_slug = "pankajmathur/orca_mini_v9_5_1B-Instruct_preview"

# Renamed from "pipeline" to avoid shadowing the imported function.
pipe = pipeline(
    "text-generation",
    model=model_slug,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
]

outputs = pipe(
    messages,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.01,
    top_k=100,
    top_p=0.95,
)
print(outputs[0]["generated_text"][-1])
```
- Cloud GPUs: For better performance, consider running on cloud GPUs from providers such as AWS, Google Cloud, or Azure.
License
The model is released under the Llama 3.2 license. Users are encouraged to provide proper credit and attribution when building on or customizing the model.