Llama 3.1 70B

meta-llama

Introduction

The Meta Llama 3.1 collection is a set of multilingual large language models intended for commercial and research use across multiple languages. The instruction-tuned variants are optimized for multilingual dialogue and outperform many available open-source and closed chat models on common industry benchmarks.

Architecture

Llama 3.1 is an auto-regressive language model built on an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align the models with human preferences for helpfulness and safety. Supported languages are English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

Training

Llama 3.1 models were pre-trained on approximately 15 trillion tokens of publicly available data; fine-tuning used publicly available instruction datasets along with synthetically generated examples. Training ran on Meta's custom-built GPU clusters and consumed a cumulative 39.3 million GPU hours across the collection. Meta reports net-zero market-based greenhouse gas emissions for training, with electricity use matched by renewable energy.
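To put the 39.3 million GPU hours in perspective, a back-of-the-envelope calculation converts them into wall-clock time. The cluster size below is a purely hypothetical assumption for illustration, not a disclosed figure:

```python
# Back-of-the-envelope: wall-clock time for 39.3M GPU hours on a large cluster.
# The cluster size is a hypothetical assumption, not a number from Meta.
total_gpu_hours = 39_300_000      # cumulative across the Llama 3.1 collection
assumed_cluster_gpus = 16_000     # hypothetical cluster size for illustration

wall_clock_hours = total_gpu_hours / assumed_cluster_gpus
wall_clock_days = wall_clock_hours / 24
print(f"{wall_clock_hours:,.0f} hours ≈ {wall_clock_days:.0f} days")  # ≈ 102 days
```

Even under this generous assumption, training a collection at this scale occupies a very large cluster for months.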

Guide: Running Locally

  1. Install Transformers: Llama 3.1 requires transformers >= 4.43.0; upgrade with pip install --upgrade transformers.

  2. Running Inference:

    import transformers
    import torch
    
    # Note: this checkpoint is the base (not instruction-tuned) model.
    model_id = "meta-llama/Meta-Llama-3.1-70B"
    
    # Load weights in bfloat16 and let device_map="auto" shard them
    # across the available GPUs.
    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )
    
    # The pipeline returns a list of dicts with a "generated_text" key.
    print(pipeline("Hey, how are you doing today?"))
    
  3. Use Cloud GPUs: A 70B model exceeds the memory of a single consumer GPU, so consider cloud providers offering multi-GPU instances, such as AWS, Google Cloud, or Azure.
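Before choosing hardware for the steps above, a quick estimate of the memory needed just to hold the weights is useful: bfloat16 uses 2 bytes per parameter, and activations, KV cache, and framework overhead come on top of this figure.

```python
# Rough VRAM estimate for holding the Llama 3.1 70B weights in bfloat16.
# Excludes activations, KV cache, and framework overhead.
params = 70e9            # ~70 billion parameters
bytes_per_param = 2      # bfloat16 = 16 bits = 2 bytes

weight_gb = params * bytes_per_param / 1e9
print(f"~{weight_gb:.0f} GB just for the weights in bf16")  # ~140 GB
```

At roughly 140 GB for the weights alone, the model cannot fit on a single 80 GB GPU in bfloat16, which is why device_map="auto" shards it across devices and why multi-GPU cloud instances are recommended.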

License

The Llama 3.1 models are distributed under the Llama 3.1 Community License, which grants a non-exclusive, worldwide, non-transferable, royalty-free limited license to use, reproduce, and distribute the Llama Materials. Compliance with applicable laws and display of attribution are required. For detailed terms, consult the full Llama 3.1 Community License text distributed with the model.