orca_mini_v9_1_1B-Instruct

pankajmathur

Introduction

ORCA_MINI_V9_1_LLAMA-3.2-1B-INSTRUCT is a text generation model built on Llama-3.2-1B-Instruct and trained on several fine-tuning datasets. It is intended as a conversational assistant, and users are encouraged to customize and extend it.

Architecture

The model is based on meta-llama/Llama-3.2-1B-Instruct. It loads through the Transformers library for text generation tasks and can run in half precision (bfloat16) or, using the bitsandbytes library for quantization, in 4-bit or 8-bit form.
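
To make the trade-off between these formats concrete, the sketch below estimates the memory needed for the weights alone. The parameter count of about 1.24 billion is an assumption (it is not stated in this card), and the estimate ignores activations, the KV cache, and quantization overhead, so real usage will be higher.

```python
# Rough weight-memory estimate for the three loading formats.
# ASSUMPTION: ~1.24e9 parameters; this figure is not taken from this card.
ASSUMED_PARAM_COUNT = 1.24e9

# Bits used to store each weight in each format.
BITS_PER_WEIGHT = {"bfloat16": 16, "8-bit": 8, "4-bit": 4}


def weight_memory_gib(param_count: float, bits_per_weight: int) -> float:
    """Return the size of the weights in GiB, ignoring all overhead."""
    return param_count * bits_per_weight / 8 / 2**30


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name:>8}: {weight_memory_gib(ASSUMED_PARAM_COUNT, bits):.2f} GiB")
```

Under that assumption, the weights take roughly 2.3 GiB in bfloat16, 1.2 GiB in 8-bit, and 0.6 GiB in 4-bit.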

Training

The ORCA_MINI_V9_1_LLAMA-3.2-1B-INSTRUCT model was trained on multiple datasets, including pankajmathur/orca_mini_v1_dataset and pankajmathur/orca_mini_v8_sharegpt_format. The goal of training is a general-purpose model that is safe and robust across applications. The model includes safety fine-tuning, and its training data combines human-generated and synthetic examples to maintain quality and mitigate risks.
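
The name of the second dataset suggests the ShareGPT conversation layout. As an illustration only, the sketch below converts one record in the commonly used ShareGPT layout (a `conversations` list of `from`/`value` turns) into the role/content messages that chat pipelines expect. The field names, the role labels, and the example record are assumptions and have not been checked against the dataset itself.

```python
# ASSUMPTION: records follow the common ShareGPT layout; this has not been
# verified against pankajmathur/orca_mini_v8_sharegpt_format.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def sharegpt_to_messages(record: dict) -> list:
    """Convert one ShareGPT-style record into a list of chat messages."""
    messages = []
    for turn in record["conversations"]:
        role = ROLE_MAP.get(turn["from"])
        if role is None:
            raise ValueError(f"unknown speaker label: {turn['from']!r}")
        messages.append({"role": role, "content": turn["value"]})
    return messages


# Hypothetical record, written by hand for illustration.
example = {
    "conversations": [
        {"from": "system", "value": "You are Orca Mini, a helpful AI assistant."},
        {"from": "human", "value": "Hello Orca Mini, what can you do for me?"},
        {"from": "gpt", "value": "I can answer questions and help with writing."},
    ]
}
print(sharegpt_to_messages(example))
```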

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies:

    • Install the Transformers library and PyTorch.
    • Optionally, install the bitsandbytes library for quantization.
  2. Set Up the Environment:

    • Ensure you have a compatible GPU. Consider using cloud GPUs like those from AWS, Google Cloud, or Azure for better performance.
  3. Load the Model:

    • Use the provided code snippets to load the model for text generation in the desired format (half precision, 4-bit, or 8-bit).
  4. Generate Text:

    • Execute the code to generate text based on input prompts.
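
Before loading the model, it can help to confirm that the environment from steps 1 and 2 is in place. The following is a minimal sketch using only the standard library for the package check. The helper names are hypothetical, and `accelerate` is listed here as an addition to the steps above because `device_map="auto"` relies on it.

```python
import importlib.util

# Packages named in the steps above, plus accelerate (an addition, see lead-in).
REQUIRED = ["torch", "transformers", "accelerate"]
OPTIONAL = ["bitsandbytes"]  # only needed for 4-bit or 8-bit loading


def missing_packages(names):
    """Return the names from the list that are not importable here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def gpu_available() -> bool:
    """Return True if PyTorch imports cleanly and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    try:
        import torch
    except (ImportError, OSError):
        return False
    return bool(torch.cuda.is_available())


print("missing required:", missing_packages(REQUIRED))
print("missing optional:", missing_packages(OPTIONAL))
print("CUDA GPU available:", gpu_available())
```

Any name printed under "missing required" would typically be installed with pip before continuing.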

Example code for half precision:

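import torch
from transformers import pipeline

model_slug = "pankajmathur/orca_mini_v9_1_1B-Instruct"

# Build the text-generation pipeline with the weights in bfloat16.
generator = pipeline(
    "text-generation",
    model=model_slug,
    torch_dtype=torch.bfloat16,  # half precision
    device_map="auto",  # place the model on the available device(s)
)
messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"}
]
outputs = generator(messages, max_new_tokens=128, do_sample=True, temperature=0.01, top_k=100, top_p=0.95)
# The pipeline returns the whole conversation; the last entry is the assistant's reply.
print(outputs[0]["generated_text"][-1])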
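
For the 4-bit and 8-bit formats mentioned in step 3, one possible approach is to pass a `BitsAndBytesConfig` when loading the model. The sketch below is not the author's snippet: the helper names are hypothetical, and the NF4 quantization type, double quantization, and bfloat16 compute dtype are assumed settings chosen for illustration. It needs a CUDA GPU with bitsandbytes installed.

```python
MODEL_SLUG = "pankajmathur/orca_mini_v9_1_1B-Instruct"


def quantization_kwargs(mode: str) -> dict:
    """Return BitsAndBytesConfig keyword arguments for '4bit' or '8bit'."""
    if mode == "4bit":
        # ASSUMPTION: these 4-bit settings are illustrative choices.
        return {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",
            "bnb_4bit_use_double_quant": True,
            "bnb_4bit_compute_dtype": "bfloat16",
        }
    if mode == "8bit":
        return {"load_in_8bit": True}
    raise ValueError(f"mode must be '4bit' or '8bit', got {mode!r}")


def load_quantized(model_slug: str = MODEL_SLUG, mode: str = "4bit"):
    """Load the model and tokenizer with bitsandbytes quantization."""
    # Heavy imports stay local so quantization_kwargs works without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    kwargs = quantization_kwargs(mode)
    if "bnb_4bit_compute_dtype" in kwargs:
        # Turn the dtype name into the actual torch dtype object.
        kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    tokenizer = AutoTokenizer.from_pretrained(model_slug)
    model = AutoModelForCausalLM.from_pretrained(
        model_slug,
        quantization_config=BitsAndBytesConfig(**kwargs),
        device_map="auto",
    )
    return model, tokenizer


def chat(model, tokenizer, messages, max_new_tokens: int = 128) -> str:
    """Generate a reply to a list of chat messages and return it as text."""
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated reply is decoded.
    reply_ids = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)


# Example usage (downloads the model, so it is left commented out):
# model, tokenizer = load_quantized(mode="4bit")
# print(chat(model, tokenizer, [{"role": "user", "content": "Hello Orca Mini"}]))
```

Switching `mode` to `"8bit"` loads the 8-bit variant through the same path.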

License

The model is released under the llama3.2 license, which allows for foundational use, fine-tuning, and customization. Users must provide proper credit and attribution when using the model.
