orca_mini_v9_5_1B-Instruct_preview
by pankajmathur

Introduction
Orca_Mini_v9_5_Llama-3.2-1B-Instruct_preview is a model built on the Llama-3.2-1B architecture, designed to be a general-purpose AI assistant. It is trained using various datasets and supports fine-tuning and customization for specific applications.
Architecture
The model is based on the meta-llama/Llama-3.2-1B architecture and uses the transformers library. It supports several quantization formats for efficient deployment, including half precision (bfloat16), 4-bit, and 8-bit.
Training
The model was trained on a combination of human-generated and synthetic data selected for safety and effectiveness. It incorporates safety behaviors such as refusals and tone adjustments, with the goal of producing a safe and robust assistant.
Guide: Running Locally
- Installation: Ensure you have Python and transformers installed. You may also need bitsandbytes for quantization support.
- Select Model: Use the model slug pankajmathur/orca_mini_v9_5_1B-Instruct_preview in your code.
- Code Example:
```python
import torch
from transformers import pipeline

model_slug = "pankajmathur/orca_mini_v9_5_1B-Instruct_preview"

# Renamed from "pipeline" to avoid shadowing the imported function.
pipe = pipeline(
    "text-generation",
    model=model_slug,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
]

outputs = pipe(
    messages,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.01,
    top_k=100,
    top_p=0.95,
)
print(outputs[0]["generated_text"][-1])
```
- Cloud GPUs: For better performance, consider running on cloud GPUs from providers such as AWS, Google Cloud, or Azure.
License
The model is released under the Llama 3.2 license. Users are encouraged to provide proper credit and attribution when building on or customizing the model.