Llama-Thinker-3B-Preview2
Introduction
Llama-Thinker-3B-Preview2 is a pretrained and instruction-tuned generative model designed for multilingual applications. It is capable of performing complex reasoning tasks effectively, utilizing long chains of thought.
Architecture
The model is based on Llama 3.2, an autoregressive language model that uses an optimized transformer architecture. It undergoes supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
Training
Llama-Thinker-3B-Preview2 is trained using synthetic datasets to enhance its reasoning capabilities. It is tailored for multilingual tasks and complex reasoning, making it versatile for various applications.
Guide: Running Locally
Running with Transformers
To use the model with Transformers, ensure you have version 4.43.0 or later. Update your installation with:
pip install --upgrade transformers
Execute the following Python script to run the model:
```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Llama-Thinker-3B-Preview2"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])
```
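When the pipeline is given a chat-style message list, the returned `generated_text` field holds the full conversation, so the last entry is the model's reply. A small helper (the name `last_assistant_reply` is hypothetical, and the output shape is assumed from the script above) makes that extraction explicit; it is shown here against a mocked output so it runs without downloading the model:

```python
def last_assistant_reply(outputs):
    """Return the content of the final message in a chat pipeline output.

    Assumes the shape produced by a chat-style text-generation pipeline:
    a list with one dict whose "generated_text" is the full message list.
    """
    messages = outputs[0]["generated_text"]
    return messages[-1]["content"]

# Mocked pipeline output, standing in for a real generation:
sample = [{"generated_text": [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "Arrr, I be a pirate chatbot!"},
]}]
print(last_assistant_reply(sample))  # -> Arrr, I be a pirate chatbot!
```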
Running with Ollama
- Download the model: run ollama run llama-thinker-3b-preview2.gguf
- Initialize and download: Ollama will initialize and download the necessary files.
- Interact with the model: after loading, interact by sending prompts.
- Exit the program: type /exit to quit.
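Beyond the interactive prompt, a locally running Ollama server also exposes an HTTP API (by default on port 11434) with a `/api/generate` endpoint. The sketch below only builds and prints the request payload; the model name is an assumption carried over from the `ollama run` command above, and the actual network call is left commented out so the snippet runs without a live server:

```python
import json

# Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request body for Ollama's HTTP API."""
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_request("llama-thinker-3b-preview2", "Who are you?")
print(json.dumps(payload))

# To actually send it (requires a running Ollama server):
# import requests
# reply = requests.post(OLLAMA_URL, json=payload).json()["response"]
```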
Cloud GPUs
For enhanced performance, consider using cloud GPUs provided by services like AWS, Google Cloud, or Azure.
License
The Llama-Thinker-3B-Preview2 model is licensed under the CreativeML OpenRAIL-M license.