Llama-3.1-5B-Instruct
prithivMLmods
Introduction
Llama-3.1-5B-Instruct is a multilingual large language model (LLM) designed for conversational tasks. It excels in multilingual dialogue, outperforming many open-source and commercial chat models on industry benchmarks.
Architecture
The model features 5 billion parameters and uses an auto-regressive transformer architecture. It is optimized for multilingual text generation, particularly in dialogue-based use cases, including tasks such as question answering, translation, and instruction following.
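Auto-regressive here means the model produces one token at a time, feeding each generated token back in as input for the next step. A minimal greedy-decoding sketch of that loop (using a toy stand-in scoring function, not the real network):

```python
# Toy illustration of auto-regressive (greedy) decoding.
# `toy_logits` is a made-up stand-in for a language model's forward pass.

def toy_logits(context):
    # Hypothetical scorer over a 5-token vocabulary: always favors
    # the token one greater than the last token in the context.
    last = context[-1]
    return [1.0 if tok == (last + 1) % 5 else 0.0 for tok in range(5)]

def generate(prompt, max_new_tokens):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = toy_logits(tokens)           # score every vocabulary token
        next_tok = scores.index(max(scores))  # greedy: take the argmax
        tokens.append(next_tok)               # feed it back in (auto-regression)
    return tokens

print(generate([0], 4))  # each step conditions on all previously emitted tokens
```

A real LLM replaces `toy_logits` with a transformer forward pass and usually samples from the distribution rather than taking the argmax, but the feed-the-output-back-in loop is the same.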
Training
Llama-3.1-5B-Instruct is fine-tuned using Supervised Fine-Tuning (SFT) and Reinforcement Learning with Human Feedback (RLHF). These techniques align the model with human preferences, enhancing its helpfulness, safety, and ability to generate natural conversations.
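In SFT, training runs on full chat transcripts, but the loss is typically computed only on the assistant's tokens, with prompt positions masked out. A schematic sketch of that masking in plain Python (the token distributions and labels below are invented for illustration):

```python
import math

# Token-level cross-entropy with prompt masking, as commonly done in SFT.
# -100 is the conventional "ignore" label (e.g. PyTorch CrossEntropyLoss's
# default ignore_index).
IGNORE = -100

def masked_cross_entropy(probs, labels):
    """Average -log p(label) over positions whose label is not IGNORE."""
    losses = [
        -math.log(p[y])
        for p, y in zip(probs, labels)
        if y != IGNORE
    ]
    return sum(losses) / len(losses)

# Predicted distributions over a toy 3-token vocabulary, one per position.
probs = [
    [0.7, 0.2, 0.1],  # prompt token: masked out of the loss
    [0.1, 0.8, 0.1],  # assistant token, true id 1
    [0.2, 0.2, 0.6],  # assistant token, true id 2
]
labels = [IGNORE, 1, 2]

print(masked_cross_entropy(probs, labels))
```

Only the two assistant positions contribute to the loss, so the model is pushed to imitate the responses, not the prompts.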
Guide: Running Locally
Requirements
- Install the latest version of Transformers:
  pip install --upgrade transformers
- Ensure PyTorch is installed with support for bfloat16:
  pip install torch
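bfloat16 keeps float32's 8-bit exponent but truncates the mantissa to 7 bits, which is why it halves memory while preserving float32's dynamic range. A quick sketch of the format (pure Python, independent of PyTorch) showing the precision loss:

```python
import struct

def float_to_bfloat16_bits(x):
    """Truncate a float32 to bfloat16 by keeping the top 16 bits."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16  # bfloat16 is the high half of the float32 bit pattern

def bfloat16_bits_to_float(b):
    """Re-expand bfloat16 bits to float32 (low mantissa bits become zero)."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x

v = 3.14159
approx = bfloat16_bits_to_float(float_to_bfloat16_bits(v))
print(approx)  # close to v, but with only ~3 decimal digits of precision
```

The rounded-off value is what the model's weights actually hold when loaded with torch_dtype=torch.bfloat16.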
Example Code
To use the model for conversational inference:
import transformers
import torch

# Define the model ID
model_id = "prithivMLmods/Llama-3.1-5B-Instruct"

# Set up the pipeline for text generation
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",  # Use the best available device
)

# Define the conversation messages
messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Generate a response
outputs = pipeline(
    messages,
    max_new_tokens=256,
)

# Print the assistant's reply (the last message in the returned conversation)
print(outputs[0]["generated_text"][-1])
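With chat-style input, the text-generation pipeline returns the whole conversation (the input messages plus the model's new turn) under "generated_text", so indexing with [-1] selects the assistant's reply. A mock of that return structure (the reply text here is invented, not real model output):

```python
# Shape of the pipeline's return value for chat input (mocked assistant reply).
outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "Arrr, I be a pirate chatbot!"},
        ]
    }
]

reply = outputs[0]["generated_text"][-1]  # last message is the model's reply
print(reply["role"], "->", reply["content"])
```

To get just the reply text rather than the full message dict, read reply["content"].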
Cloud GPUs
For faster inference, consider running the model on cloud GPU services such as AWS, Google Cloud, or Azure.
License
The model is released under the Llama 3.1 Community License. Please refer to the license for the specific terms of use.