Llama 3.1 8B

meta-llama

Introduction

Llama 3.1 is a collection of multilingual large language models (LLMs) developed by Meta, designed for text generation and optimized for multilingual dialogue. The models are available in 8B, 70B, and 405B parameter sizes and outperform many existing open-source and closed chat models on industry benchmarks.

Architecture

Llama 3.1 is an auto-regressive language model built on an optimized transformer architecture: it generates text one token at a time, conditioning each new token on all tokens before it. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences. The models support multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
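
To make "auto-regressive" concrete, here is a minimal sketch of greedy decoding, in which each new token is chosen from scores conditioned on everything generated so far. The bigram table and the `toy_scores` function are hypothetical stand-ins for a real model's next-token distribution; they are not part of Llama 3.1 or of any library.

```python
from typing import Callable, Dict, List


def greedy_decode(
    next_token_scores: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    max_new_tokens: int,
    eos_token: str,
) -> List[str]:
    """Append the highest-scoring next token until EOS or the length limit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # The scores depend on the full sequence generated so far.
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)
        tokens.append(best)
        if best == eos_token:
            break
    return tokens


# Hypothetical toy "model": scores depend only on the last token.
BIGRAMS = {
    "hello": {"world": 0.9, "there": 0.1},
    "world": {"<eos>": 1.0},
    "there": {"<eos>": 1.0},
}


def toy_scores(tokens: List[str]) -> Dict[str, float]:
    return BIGRAMS.get(tokens[-1], {"<eos>": 1.0})


generated = greedy_decode(toy_scores, ["hello"], max_new_tokens=5, eos_token="<eos>")
print(generated)  # ['hello', 'world', '<eos>']
```

A real model replaces `toy_scores` with a transformer forward pass over the whole token sequence, and generation code often samples from the scores instead of always taking the maximum.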

Training

Llama 3.1 was pretrained on approximately 15 trillion tokens from publicly available sources. Fine-tuning used publicly available instruction datasets and over 25 million synthetically generated examples. Training the three model sizes took a cumulative 39.3 million GPU hours on Meta's custom-built GPU cluster.

Guide: Running Locally

Basic Steps

  1. Install Transformers: Ensure you have transformers version >= 4.43.0 by running:
    pip install --upgrade transformers
    
  2. Load the Model:
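    import torch
    import transformers

    model_id = "meta-llama/Llama-3.1-8B"

    # Build a text-generation pipeline. bfloat16 halves weight memory versus
    # float32; device_map="auto" (which needs the accelerate package) places
    # the weights on the available devices.
    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )

    # This is the base model, so it continues the prompt as plain text.
    output = pipeline("Hey how are you doing today?")
    print(output)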
    
  3. Download Original Checkpoints: Use huggingface-cli to download:
    huggingface-cli download meta-llama/Llama-3.1-8B --include "original/*" --local-dir Llama-3.1-8B
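
Before loading the model, it can help to confirm that the installed transformers release satisfies the >= 4.43.0 requirement from step 1. The sketch below uses only the standard library; `numeric_version` and `meets_minimum` are hypothetical helper names, and the comparison is deliberately simplified (it looks only at the leading numeric components and ignores pre-release ordering).

```python
import re
from importlib import metadata


def numeric_version(version, width=3):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at suffixes such as "dev0"
        parts.append(int(match.group()))
    # Pad so that "4.43" compares equal to "4.43.0".
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed, required="4.43.0"):
    """True if the installed version is at least the required one."""
    return numeric_version(installed) >= numeric_version(required)


def installed_transformers_version():
    """Return the installed transformers version, or None if it is absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


found = installed_transformers_version()
if found is None:
    print("transformers is not installed")
else:
    print(found, "is new enough" if meets_minimum(found) else "is too old")
```

If the check reports an older release, rerunning the pip command from step 1 upgrades it.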
    

Suggested Cloud GPUs

Cloud providers such as AWS, Google Cloud Platform, and Azure offer access to GPUs such as NVIDIA's A100 or V100, which can be used for model inference and training.
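
When choosing a GPU, a rough estimate of the memory needed for the weights alone is a useful first check. The sketch below multiplies a parameter count by the bytes per parameter for a given data type. The nominal count of 8e9 parameters and the helper name `weight_memory_gib` are assumptions for illustration; the exact parameter count of the 8B model differs slightly.

```python
# Bytes used to store one parameter in common data types.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params, dtype):
    """Approximate memory for the weights only, in GiB (1 GiB = 1024**3 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


nominal_params = 8e9  # assumed round figure for the 8B model

for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype}: {weight_memory_gib(nominal_params, dtype):.1f} GiB")
# bfloat16 comes to about 14.9 GiB for the weights
```

This covers the weights only. Activations and the key-value cache need additional memory at inference time, and training needs considerably more for gradients and optimizer state, so the figure is a lower bound, not a sizing guarantee.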

License

The Llama 3.1 models are released under the Llama 3.1 Community License. This non-exclusive, worldwide, non-transferable, and royalty-free license allows users to use, reproduce, distribute, and modify the Llama Materials. The license requires that derivative works include attribution to Meta and that all use complies with applicable laws and the Acceptable Use Policy. Licensees whose products or services have more than 700 million monthly active users must request a separate license from Meta.
