Meta-Llama-3-70B

meta-llama

Introduction

Meta-Llama 3 is a family of large language models (LLMs) developed by Meta, offering text generation capabilities optimized for dialogue use cases. These models are available in sizes of 8 billion and 70 billion parameters, both in pre-trained and instruction-tuned variants. Meta-Llama 3 models are designed to outperform open-source chat models on industry benchmarks, focusing on helpfulness and safety.

Architecture

Meta-Llama 3 models are auto-regressive language models using an optimized transformer architecture. They employ supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences. Both the 8B and 70B parameter versions utilize Grouped-Query Attention (GQA) to improve inference scalability.
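The idea behind GQA is that several query heads share a single key/value head, shrinking the KV cache at inference time. A minimal NumPy sketch (illustrative only, not Meta's implementation; head counts and dimensions here are toy values) of that sharing:

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy GQA: q has n_q_heads, k/v have fewer n_kv_heads; each KV
    head serves n_q_heads // n_kv_heads query heads."""
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    # Repeat each KV head so it is shared by its group of query heads
    k = np.repeat(k, group, axis=0)          # (n_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                        # (n_q_heads, seq, d)

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))   # 8 query heads
k = rng.standard_normal((2, 4, 16))   # only 2 KV heads -> 4x smaller KV cache
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # (8, 4, 16)
```

With 8 query heads but only 2 KV heads, the KV cache is a quarter of the size it would be under standard multi-head attention, while the output shape is unchanged.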

Training

Meta-Llama 3 models were pretrained on over 15 trillion tokens from publicly available online data, with a cutoff of March 2023 for 8B and December 2023 for 70B models. Fine-tuning utilized publicly available instruction datasets and over 10 million human-annotated examples, ensuring no use of Meta user data. The cumulative training involved 7.7 million GPU hours, with a total carbon footprint of 2290 tCO2eq, fully offset by Meta's sustainability program.

Guide: Running Locally

  1. Set Up Environment: Install the required libraries, such as transformers, torch, and accelerate (needed for device_map="auto").
  2. Download Model: Use the Hugging Face CLI to download the Meta-Llama-3-70B model.
    huggingface-cli download meta-llama/Meta-Llama-3-70B --include "original/*" --local-dir Meta-Llama-3-70B
    
  3. Run Model: Utilize the transformers library to create a text generation pipeline.
    import transformers
    import torch

    # Load the 70B model in bfloat16 and shard it across available GPUs
    model_id = "meta-llama/Meta-Llama-3-70B"
    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )
    response = pipeline("Hey how are you doing today?")
    print(response[0]["generated_text"])
    
  4. Consider Cloud GPUs: The 70B model requires substantial GPU memory; for practical performance, consider cloud GPUs such as the NVIDIA H100 80GB.
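The GPU recommendation in step 4 follows from a quick weights-only memory estimate (a rough sketch; real usage also needs room for activations and the KV cache, and the byte sizes below are the standard widths for each dtype):

```python
# Back-of-envelope VRAM needed just to hold the 70B parameters
params = 70e9
bytes_per_param = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}

for dtype, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3
    print(f"{dtype:>8}: ~{gib:.0f} GiB")
```

In bfloat16 the weights alone come to roughly 130 GiB, which exceeds a single 80 GB card, so the bf16 pipeline above needs at least two H100-80GB GPUs (or a quantized variant to fit on one).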

License

Meta-Llama 3 is distributed under a custom community license, granting non-exclusive, worldwide rights for use, reproduction, and distribution. Redistribution requires attribution and compliance with the Acceptable Use Policy. Commercial use is subject to additional terms if the product exceeds 700 million monthly active users. The full license details are available at Meta's Llama 3 License page.