Llama-Thinker-3B-Preview-GGUF
prithivMLmods/Llama-Thinker-3B-Preview-GGUF
Introduction
Llama-Thinker-3B-Preview-GGUF is a pretrained and instruction-tuned generative model designed for multilingual applications. The model is trained using synthetic datasets with long chains of thought, enabling it to perform complex reasoning tasks effectively. It is built on the Llama 3.2 architecture and fine-tuned for alignment with human preferences.
Architecture
The model is an autoregressive language model built on an optimized transformer architecture. It undergoes supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to improve its alignment with human preferences for helpfulness and safety.
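To make "autoregressive" concrete, the sketch below shows the generation loop such models use: predict one token, append it to the context, and repeat. The next-token function here is a toy stand-in, not the real Llama network.

```python
def toy_next_token(context):
    # Hypothetical stand-in for the transformer forward pass:
    # deterministically maps a token-id context to the next token id.
    return (sum(context) + len(context)) % 50

def generate(prompt_ids, max_new_tokens, eos_id=0):
    """Greedy autoregressive loop: each predicted token is appended
    to the context and fed back in for the next prediction."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        nxt = toy_next_token(ids)
        if nxt == eos_id:  # stop early on end-of-sequence
            break
        ids.append(nxt)
    return ids

out = generate([3, 7, 11], max_new_tokens=5)
print(out)
```

The real model follows the same loop, but `toy_next_token` is replaced by a full transformer forward pass plus a sampling strategy (temperature, top-p, etc.).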
Training
The model is trained on synthetic datasets that encourage long chains of thought, allowing it to tackle complex reasoning tasks. Training combines supervised fine-tuning with reinforcement learning from human feedback to align the model with human preferences.
Guide: Running Locally
Example 1: Running with Ollama
Step 1: Download the Model
Use the command below; Ollama downloads the model automatically if it is not already present locally:
ollama run llama-thinker-3b-preview.gguf
Step 2: Model Initialization and Download
Ollama will initialize and download the necessary files. You will see progress output as the model files are pulled and verified.
Step 3: Interact with the Model
Once loaded, you can interact with the model by sending prompts. For example:
>>> How can you assist me today?
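Besides the interactive prompt, a running Ollama instance also serves a local HTTP API (by default at http://localhost:11434), so the same model can be queried from code. The sketch below builds a request for the `/api/generate` endpoint; the model name is an assumption mirroring the `ollama run` command above and must match whatever name your local copy uses.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(prompt, model="llama-thinker-3b-preview.gguf"):
    # "model" must match the name the model was pulled under locally;
    # the default here mirrors the command above and is an assumption.
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # request one JSON response instead of a token stream
    }).encode("utf-8")

def ask(prompt):
    """Send a prompt to the local Ollama server and return the reply text."""
    req = request.Request(
        OLLAMA_URL,
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Only builds and inspects the payload; uncomment to hit a live server:
    # print(ask("How can you assist me today?"))
    print(json.loads(build_payload("How can you assist me today?"))["model"])
```

Setting `"stream": False` is the simplest choice for scripts; leaving streaming on returns one JSON object per generated token, which suits interactive UIs.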
Step 4: Exit the Program
Exit the interactive session by typing:
/bye
Notes on Using Quantized Models
- VRAM/CPU Requirements: Ensure your system has enough RAM (or VRAM, if offloading layers to a GPU) for the chosen quantization level; lower-bit quants need less memory at some cost in output quality.
- Model Format: Use the .gguf format for compatibility.
Cloud GPUs
For enhanced performance, consider using cloud-based GPUs such as AWS EC2, Google Cloud, or Azure.
License
The Llama-Thinker-3B-Preview-GGUF model is released under the creativeml-openrail-m license, which permits broad creative use subject to the license's use-based restrictions.