Orca_Mini_v9_5_Llama-3.2-1B-Instruct

pankajmathur

Introduction

Orca_Mini_v9_5_Llama-3.2-1B-Instruct is a text generation model built on Meta's Llama-3.2 model family, designed to serve as a versatile AI assistant. It is optimized for safety and effectiveness across a range of text generation tasks and supports further customization to fit specific needs.

Architecture

The model is based on Llama-3.2-1B, the smallest model in the Llama-3.2 family. It was fine-tuned on various SFT datasets to strengthen its instruction-following capability, allowing it to handle a wide range of conversational tasks efficiently. It also supports several quantization configurations, including 4-bit and 8-bit formats, for reduced memory use on different hardware setups.

Training

The model was fine-tuned using a combination of human-generated and synthetic data. This approach ensures high-quality responses while mitigating potential safety risks. The fine-tuning process emphasizes safe interaction, refusal handling, and appropriate response tones to adversarial prompts.

Guide: Running Locally

To run the model locally:

  1. Install the required libraries:

    • transformers for model loading and inference.
    • bitsandbytes for quantization support.
  2. Set up the model pipeline:

    • Use the pipeline API from transformers to load the model.
    • Configure the model to run in the desired precision (e.g., bfloat16, 4-bit, or 8-bit).
  3. Example Code:

    import torch
    from transformers import pipeline
    
    model_slug = "pankajmathur/orca_mini_v9_5_1B-Instruct"
    generator = pipeline(
        "text-generation",
        model=model_slug,
        torch_dtype=torch.bfloat16,  # or pass a quantization config for 4-/8-bit
        device_map="auto",
    )
    messages = [
        {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
        {"role": "user", "content": "Hello Orca Mini, what can you do for me?"}
    ]
    outputs = generator(messages, max_new_tokens=128, do_sample=True, temperature=0.01, top_k=100, top_p=0.95)
    print(outputs[0]["generated_text"][-1])  # last message in the chat is the assistant's reply
    
  4. Cloud GPUs:

    • For efficient performance, consider using cloud GPUs like Google Colab with a T4 GPU.
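The libraries from step 1 (plus accelerate, which `device_map="auto"` relies on) can be installed with pip. Package names below are the standard PyPI ones; exact versions are not pinned by this card:

```shell
# Install the inference stack for running the model locally
pip install torch transformers accelerate bitsandbytes
```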

License

The model is released under the Llama 3.2 license. Users are encouraged to credit the model appropriately and may adapt it for further fine-tuning or specific use cases.
