Llama 3.2 3B (meta-llama)
Introduction
Llama 3.2 is a collection of multilingual large language models developed by Meta. It is optimized for text-generation tasks in multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. The models are available in 1B and 3B sizes and are designed for applications such as dialogue systems, summarization, and agentic retrieval.
Architecture
Llama 3.2 utilizes an optimized transformer architecture in an auto-regressive language model setup. The models are instruction-tuned using supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align them with human preferences for helpfulness and safety. The quantization scheme uses 4-bit groupwise quantization for weights and 8-bit dynamic quantization for activations in linear layers, which improves inference performance and reduces model size.
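To illustrate the idea behind groupwise weight quantization, the following is a minimal sketch in NumPy, not the actual Llama 3.2 implementation: weights are split into fixed-size groups, and each group gets its own scale so that its values map into the signed 4-bit integer range.

```python
import numpy as np

def quantize_groupwise_4bit(weights, group_size=32):
    """Symmetric 4-bit groupwise quantization (illustrative sketch only).

    Each group of `group_size` weights shares one float scale; values are
    rounded into the int4 range [-8, 7].
    """
    w = weights.reshape(-1, group_size)
    # One scale per group: map the group's max magnitude onto 7 (int4 positive max).
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    scales = np.maximum(scales, 1e-8)  # guard against all-zero groups
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    """Reconstruct approximate float weights from int4 codes and scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

w = np.random.randn(128).astype(np.float32)
q, scales = quantize_groupwise_4bit(w)
w_hat = dequantize(q, scales)
```

The per-group scales keep the quantization error proportional to each group's local magnitude, which is why groupwise schemes lose less accuracy than a single scale for the whole tensor.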
Training
Llama 3.2 was trained on up to 9 trillion tokens of publicly available data, with a knowledge cutoff of December 2023. Training ran on custom GPU clusters and production infrastructure, consuming 916k GPU hours on H100-80GB hardware. Because Meta maintains net-zero greenhouse gas emissions, the market-based emissions for training were zero. The training process included knowledge distillation, rejection sampling, and direct preference optimization to refine model performance.
Guide: Running Locally
- Install Prerequisites:
  - Ensure Python and pip are installed.
  - Install the Hugging Face Transformers library:

    pip install --upgrade transformers
- Load the Model:
  - Use the Transformers pipeline for text generation:

    import torch
    from transformers import pipeline

    model_id = "meta-llama/Llama-3.2-3B"
    pipe = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    result = pipe("The key to life is")
- Download Checkpoints:
  - Use the Hugging Face CLI to download the original checkpoints:

    huggingface-cli download meta-llama/Llama-3.2-3B --include "original/*" --local-dir Llama-3.2-3B
- Hardware Recommendations:
  - For optimal performance, consider using cloud GPUs such as NVIDIA's A100 or H100.
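As a rough guide to what "optimal" means here, the weight memory needed just to load the model can be estimated from the parameter count and the bytes per parameter. This is back-of-the-envelope arithmetic only (it assumes roughly 3.21B parameters and ignores the KV cache, activations, and framework overhead, which add to real-world usage):

```python
# Rough weight-memory estimate for Llama 3.2 3B (illustrative arithmetic only).
params = 3.21e9  # approximate parameter count for the 3B model
bytes_per_param = {"fp32": 4, "bf16": 2, "int4": 0.5}

for dtype, nbytes in bytes_per_param.items():
    gb = params * nbytes / 1024**3
    print(f"{dtype}: ~{gb:.1f} GB")
```

At bf16 the weights alone come to roughly 6 GB, which is why the model fits comfortably on a single A100 or H100, while 4-bit quantization brings it within reach of much smaller GPUs.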
License
Llama 3.2 is distributed under the Llama 3.2 Community License, which grants a non-exclusive, worldwide, non-transferable, royalty-free license. Users must comply with the Acceptable Use Policy and display appropriate attributions, such as "Built with Llama," when distributing products that incorporate Llama Materials. Commercial terms apply for entities with over 700 million monthly active users. The full license and acceptable use policy can be found on Meta's documentation pages.