Orca_Mini_v9_6_3B-Instruct
pankajmathur

Introduction
Orca_Mini_v9_6_Llama-3.2-3B-Instruct is a text generation model fine-tuned from Meta's Llama-3.2-3B-Instruct. It is intended as a general-purpose conversational AI and can serve as a base for further customization and enhancement through additional fine-tuning.
Architecture
The model is based on Meta's Llama-3.2-3B-Instruct architecture, with training conducted on datasets such as pankajmathur/orca_mini_v1_dataset and pankajmathur/orca_mini_v8_sharegpt_format. It is implemented using the transformers library and supports PyTorch.
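As a quick way to inspect those datasets, here is a minimal sketch using the Hugging Face datasets library; the split names and record schema are not documented in this card, so the code only prints whatever is actually present:

```python
# Minimal sketch: peek at one of the fine-tuning datasets from the Hub.
# Assumes the `datasets` library is installed (pip install datasets).
from datasets import load_dataset

ds = load_dataset("pankajmathur/orca_mini_v1_dataset")
print(ds)  # available splits and row counts

first_split = next(iter(ds))  # take whichever split exists first
print(ds[first_split][0])     # one raw record, to see its schema
```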
Training
The model is fine-tuned on a diverse set of datasets to improve its conversational and reasoning capabilities. Training incorporates safety measures and dataset quality control to promote robust, reliable responses.
Guide: Running Locally
To run the model locally:
- Environment Setup:
  - Ensure you have Python installed.
  - Install the transformers, torch, and bitsandbytes libraries (e.g. pip install transformers torch bitsandbytes).
- Model Loading:
  - Use the transformers pipeline to load the model in the precision you need (bfloat16, 4-bit, or 8-bit); see the snippets below.
- Execution:
  - Define your input messages and run the text generation pipeline.
- Cloud GPUs:
  - For enhanced performance, consider using a cloud GPU such as Google Colab with a T4.
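The following snippet loads the model with its default settings and runs a short chat: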
```python
import torch
from transformers import pipeline

model_slug = "pankajmathur/orca_mini_v9_6_3B-Instruct"

# Load the model in its default precision; device_map="auto" places it
# on a GPU when one is available.
pipe = pipeline(
    "text-generation",
    model=model_slug,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Orca Mini, a helpful AI assistant."},
    {"role": "user", "content": "Hello Orca Mini, what can you do for me?"},
]

outputs = pipe(messages, max_new_tokens=128, do_sample=True, temperature=0.01, top_k=100, top_p=0.95)
print(outputs[0]["generated_text"][-1])  # the last chat turn is the assistant's reply
```
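The guide above mentions bfloat16, 4-bit, and 8-bit loading, but only the default call is shown. The variants below are a sketch assuming the standard transformers and bitsandbytes quantization APIs; the specific BitsAndBytesConfig values are illustrative choices, not settings documented by this card:

```python
import torch
from transformers import BitsAndBytesConfig, pipeline

model_slug = "pankajmathur/orca_mini_v9_6_3B-Instruct"

# bfloat16: half-precision weights, the lightest setup on supported GPUs.
pipe_bf16 = pipeline(
    "text-generation",
    model=model_slug,
    device_map="auto",
    model_kwargs={"torch_dtype": torch.bfloat16},
)

# 4-bit: NF4 quantization via bitsandbytes, the smallest memory footprint.
bnb_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
pipe_4bit = pipeline(
    "text-generation",
    model=model_slug,
    device_map="auto",
    model_kwargs={"quantization_config": bnb_4bit},
)

# 8-bit: bitsandbytes int8 weights, a middle ground between size and quality.
bnb_8bit = BitsAndBytesConfig(load_in_8bit=True)
pipe_8bit = pipeline(
    "text-generation",
    model=model_slug,
    device_map="auto",
    model_kwargs={"quantization_config": bnb_8bit},
)
```

Each of these pipelines is called exactly like pipe above. Note that outputs[0]["generated_text"] holds the full chat history, which is why the earlier snippet indexes [-1] to select the assistant's reply.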
License
The model is provided under the Llama 3.2 license. Users are encouraged to credit the original authors when using the model for further development or research.