Llama-Thinker-3B-Preview2
Introduction
Llama-Thinker-3B-Preview2 is a pretrained and instruction-tuned generative model designed for multilingual applications. It is capable of performing complex reasoning tasks effectively, utilizing long chains of thought.
Architecture
The model is based on Llama 3.2, an autoregressive language model that uses an optimized transformer architecture. It undergoes supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
Training
Llama-Thinker-3B-Preview2 is trained using synthetic datasets to enhance its reasoning capabilities. It is tailored for multilingual tasks and complex reasoning, making it versatile for various applications.
Guide: Running Locally
Running with Transformers
To use the model with Transformers, ensure you have version 4.43.0 or later. Update your installation with:
pip install --upgrade transformers
Execute the following Python script to run the model:
```python
import torch
from transformers import pipeline

model_id = "prithivMLmods/Llama-Thinker-3B-Preview2"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
outputs = pipe(messages, max_new_tokens=256)
print(outputs[0]["generated_text"][-1])
```
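When the pipeline is given a chat-style message list, the returned `generated_text` field holds the full conversation, so the last entry is the model's reply. A small helper (the name `last_assistant_reply` is hypothetical, and the output shape is assumed from the script above) makes that extraction explicit; it is shown here against a mocked output so it runs without downloading the model:

```python
def last_assistant_reply(outputs):
    """Return the content of the final message in a chat pipeline output.

    Assumes the shape produced by a chat-style text-generation pipeline:
    a list with one dict whose "generated_text" is the full message list.
    """
    messages = outputs[0]["generated_text"]
    return messages[-1]["content"]

# Mocked pipeline output, standing in for a real generation:
sample = [{"generated_text": [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "Arrr, I be a pirate chatbot!"},
]}]
print(last_assistant_reply(sample))  # -> Arrr, I be a pirate chatbot!
```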
Running with Ollama
- Download the model: run ollama run llama-thinker-3b-preview2.gguf
- Initialize and download: Ollama will initialize and download the necessary files.
- Interact with the model: after loading, interact by sending prompts.
- Exit the program: type /exit to quit.
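Beyond the interactive prompt, a locally running Ollama server also exposes an HTTP API (by default on port 11434) with a `/api/generate` endpoint. The sketch below only builds and prints the request payload; the model name is an assumption carried over from the `ollama run` command above, and the actual network call is left commented out so the snippet runs without a live server:

```python
import json

# Ollama's default local endpoint for one-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request body for Ollama's HTTP API."""
    return {"model": model, "prompt": prompt, "stream": False}

payload = build_request("llama-thinker-3b-preview2", "Who are you?")
print(json.dumps(payload))

# To actually send it (requires a running Ollama server):
# import requests
# reply = requests.post(OLLAMA_URL, json=payload).json()["response"]
```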
Cloud GPUs
For enhanced performance, consider using cloud GPUs provided by services like AWS, Google Cloud, or Azure.
License
The Llama-Thinker-3B-Preview2 model is licensed under the CreativeML OpenRAIL-M license.