Llama 3.1 405B Instruct
meta-llama
Introduction
Llama 3.1 is a collection of large language models developed by Meta, optimized for multilingual dialogue and instruction-tuned for enhanced performance in text generation tasks. The models are available in 8B, 70B, and 405B parameter sizes and are designed to support commercial and research applications in multiple languages.
Architecture
Llama 3.1 models use an autoregressive, optimized transformer architecture. The instruction-tuned variants employ supervised fine-tuning and reinforcement learning with human feedback to align with human preferences for helpfulness and safety. Supported languages include English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
Training
The Llama 3.1 models were pretrained on approximately 15 trillion tokens of publicly available data and fine-tuned on instruction datasets and synthetic examples. Training was conducted on Meta's custom-built GPU cluster and consumed 39.3 million GPU hours. Because Meta maintains net-zero greenhouse gas emissions, the market-based emissions for training were 0 tons CO2eq.
Guide: Running Locally
- Setup Environment: Ensure you have Python and PyTorch installed. Install the Transformers library from Hugging Face.
- Download Model: Access the model files via the Hugging Face repository or Meta's download page.
- Load Model: Use the Transformers library to load the model for text generation tasks.
- Inference: Run text generation tasks locally using the loaded model. For optimal performance, consider using cloud GPUs like those from AWS or Google Cloud.
License
Llama 3.1 is distributed under the Llama 3.1 Community License, granting rights for use, reproduction, and modification of the model and its materials. Redistributing the model requires providing a copy of the license and displaying "Built with Llama" prominently. The license outlines compliance with applicable laws and the model's Acceptable Use Policy. For organizations with over 700 million monthly active users, a separate license from Meta is required.