Meta-Llama-3-8B-Instruct
Introduction
Meta-Llama-3-8B-Instruct is part of the Llama 3 family of large language models developed by Meta. Designed for text generation and conversational applications, it is optimized for dialogue use cases and outperforms many open-source chat models on common industry benchmarks. Llama 3 comes in two sizes (8B and 70B parameters), and the instruction-tuned variants employ techniques such as Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) to align the models with human preferences.
Architecture
Llama 3 is an auto-regressive language model utilizing an optimized transformer architecture. It focuses on generating text and code from input text. The models are available in pre-trained and instruction-tuned variants to accommodate diverse applications, with the instruction-tuned models being particularly optimized for conversational tasks.
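The auto-regressive setup described above can be sketched as a simple next-token loop. The toy model below is a hypothetical stand-in for illustration only, not the actual Llama 3 network; the point is the feedback structure, where each generated token is appended to the context before predicting the next one:

```python
def toy_next_token(context):
    """Hypothetical stand-in for a language model: returns a
    'most likely' next token given the context so far."""
    # Trivial rule: walk a fixed vocabulary in order.
    vocab = ["the", "cat", "sat", "down", "<eos>"]
    if not context:
        return vocab[0]
    idx = vocab.index(context[-1])
    return vocab[min(idx + 1, len(vocab) - 1)]

def generate(prompt_tokens, max_new_tokens=10):
    # Auto-regressive decoding: each new token is fed back into the
    # context to condition the prediction of the following token.
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        nxt = toy_next_token(tokens)
        tokens.append(nxt)
        if nxt == "<eos>":
            break
    return tokens

print(generate(["the"]))  # ['the', 'cat', 'sat', 'down', '<eos>']
```

Real decoding replaces `toy_next_token` with a forward pass of the transformer over the full context.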
Training
The Llama 3 models were pretrained using over 15 trillion tokens from publicly available data sources. The training infrastructure included Meta's Research SuperCluster and third-party cloud compute resources. The 8B and 70B models used Grouped-Query Attention (GQA) for enhanced inference scalability. The models underwent extensive fine-tuning with over 10 million human-annotated examples.
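Grouped-Query Attention improves inference scalability by sharing each key/value head across a group of query heads, which shrinks the KV cache. A minimal numpy sketch of the idea (the head counts and dimensions here are illustrative, not Llama 3's actual configuration):

```python
import numpy as np

def grouped_query_attention(q, k, v, n_kv_heads):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_q_heads // n_kv_heads query heads attends to the
    same shared key/value head, so only n_kv_heads K/V tensors are cached."""
    n_q_heads, seq, d = q.shape
    group = n_q_heads // n_kv_heads
    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # query heads in the same group share one KV head
        scores = q[h] @ k[kv].T / np.sqrt(d)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
        out[h] = weights @ v[kv]
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))  # 8 query heads
k = rng.normal(size=(2, 4, 16))  # but only 2 KV heads to cache
v = rng.normal(size=(2, 4, 16))
print(grouped_query_attention(q, k, v, n_kv_heads=2).shape)  # (8, 4, 16)
```

With 8 query heads and 2 KV heads, the KV cache is a quarter the size of standard multi-head attention while the output shape is unchanged.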
Guide: Running Locally
To run Meta-Llama-3-8B-Instruct locally, you can use the following methods:
Using Transformers
- Install Dependencies: Ensure you have `transformers` and `torch` installed.
- Load Model: Use the Transformers library to load the model.
- Generate Text: Utilize the `pipeline` or `AutoModelForCausalLM` classes to interact with the model.
```python
import transformers

# device_map="auto" places the model on available GPUs (requires accelerate)
pipeline = transformers.pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    device_map="auto",
)
# The pipeline returns a list of dicts with a "generated_text" field
output = pipeline("Hello, how are you?", max_new_tokens=50)
print(output[0]["generated_text"])
```
- Inference Parameters: Adjust parameters such as `max_new_tokens`, `temperature`, and `top_p` to control the output.
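What `temperature` and `top_p` do to the next-token distribution can be sketched in plain numpy. This illustrates the standard sampling math, not Transformers' internal implementation:

```python
import numpy as np

def sample_filtered(logits, temperature=1.0, top_p=0.9, rng=None):
    """Temperature-scale the logits, then sample from the smallest set of
    tokens whose cumulative probability reaches top_p (nucleus sampling)."""
    rng = rng or np.random.default_rng()
    scaled = logits / temperature          # low temperature sharpens the distribution
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                   # softmax
    order = np.argsort(probs)[::-1]        # most likely tokens first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1  # smallest nucleus covering top_p
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum() # renormalize over the nucleus
    return int(rng.choice(keep, p=kept))

logits = np.array([3.0, 1.0, 0.2, -1.0])
token = sample_filtered(logits, temperature=0.7, top_p=0.9)
```

Lowering `temperature` concentrates probability on the top tokens, and lowering `top_p` shrinks the candidate pool, so both make generation more deterministic.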
Using Llama3 Codebase
- Download Model: Utilize `huggingface-cli` to download the original model checkpoints.
- Run Inference: Follow the instructions in the Llama3 GitHub repository for detailed steps.
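A typical download invocation looks like the following. The `--include "original/*"` filter and local directory name are illustrative, and the commands require a Hugging Face access token for an account that has accepted the Llama 3 license:

```shell
# Authenticate with a token whose account has accepted the Llama 3 license
huggingface-cli login

# Download the original (non-Transformers) checkpoint files
huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct \
  --include "original/*" \
  --local-dir Meta-Llama-3-8B-Instruct
```

Omitting the `--include` filter downloads the full repository, including the Transformers-format weights.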
Cloud GPUs
For optimal performance, consider using cloud GPU services like AWS, Google Cloud, or Azure.
License
The Meta-Llama 3 models are available under a custom community license. The license grants a non-exclusive, worldwide, non-transferable, and royalty-free limited license for usage, reproduction, and modification of the models. Certain conditions apply, such as providing attribution and compliance with the Acceptable Use Policy. For organizations with over 700 million monthly active users, additional commercial terms may apply. The full license details can be found on the official Meta Llama 3 License Page.