Llama 3.2 3B Instruct

meta-llama

Introduction

Llama 3.2 is a collection of multilingual large language models developed by Meta. The text-only models come in 1B and 3B sizes, and the instruction-tuned versions are optimized for multilingual dialogue, including retrieval and summarization tasks. Meta reports that they outperform many available open-source and closed chat models on common industry benchmarks.

Architecture

Llama 3.2 is an auto-regressive language model that uses an optimized transformer architecture. The instruction-tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align the model with human preferences. Eight languages are officially supported (English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai), and the model can be fine-tuned for languages beyond these.
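To illustrate what "auto-regressive" means in practice, here is a minimal sketch of greedy decoding: each new token is chosen from scores computed on the tokens produced so far, then appended to the context. The lookup table and the names toy_next_token_scores and generate are hypothetical stand-ins, not part of Llama or Transformers; a real transformer would condition on the entire prefix rather than only the last token.

```python
# Toy stand-in for a language model (an assumption for illustration only).
# It maps the most recent token to scores for each candidate next token.
TOY_TABLE = {
    "<s>": {"ahoy": 0.9, "matey": 0.1},
    "ahoy": {"matey": 0.8, "</s>": 0.2},
    "matey": {"</s>": 1.0},
}


def toy_next_token_scores(prefix):
    """Return next-token scores given the tokens generated so far.

    A real transformer attends to the whole prefix; this toy only
    looks at the last token to keep the example small.
    """
    return TOY_TABLE[prefix[-1]]


def generate(max_new_tokens=8):
    """Greedy auto-regressive decoding: predict, append, repeat."""
    tokens = ["<s>"]  # start-of-sequence marker
    for _ in range(max_new_tokens):
        scores = toy_next_token_scores(tokens)
        next_token = max(scores, key=scores.get)  # pick the highest score
        if next_token == "</s>":  # stop at end-of-sequence
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker


print(generate())
```

The max_new_tokens argument plays the same role as the parameter of that name in the generation guide: it caps how many prediction steps are taken.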

Training

Llama 3.2 was pretrained on up to 9 trillion tokens of publicly available data. During pretraining, logits from larger Llama 3.1 models were incorporated as token-level targets, and knowledge distillation was used to recover performance. Pretraining was followed by multiple rounds of alignment. Training used a cumulative 916k GPU hours, with estimated emissions of 240 tons CO2eq. Meta states that it maintains net-zero greenhouse gas emissions in its operations and matches its electricity use with renewable energy.
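As a rough illustration of how a larger model's logits can serve as training targets, the sketch below computes a per-token distillation loss as the KL divergence between teacher and student distributions. This is not Meta's training code: the function names, the temperature parameter, and the choice of KL divergence as the loss are assumptions made for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) over the vocabulary for one token position."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student's current prediction
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))          # zero when the student matches
print(distillation_loss(teacher, [0.0, 0.0, 0.0]))  # positive when it does not
```

In an actual training loop this quantity would be averaged over all token positions and minimized with respect to the student's parameters, usually alongside the ordinary next-token loss.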

Guide: Running Locally

  1. Install Transformers: Ensure you have transformers >= 4.43.0 installed.

    pip install --upgrade transformers
    
  2. Set Up Environment:

    • Import the necessary libraries and set up the pipeline:

    import torch
    from transformers import pipeline
    
    model_id = "meta-llama/Llama-3.2-3B-Instruct"
    pipe = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    
  3. Generate Text:

    • Use the pipeline to generate text from a list of chat messages:

    messages = [
        {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
        {"role": "user", "content": "Who are you?"},
    ]
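    outputs = pipe(
        messages,
        max_new_tokens=256,  # upper bound on the length of the reply
    )
    # The pipeline returns the whole conversation; the last entry is the new assistant message.
    print(outputs[0]["generated_text"][-1])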
    
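  4. Hardware Recommendations: A GPU is recommended for reasonable generation speed, for example a cloud NVIDIA A100 or V100. In bfloat16 the 3B weights take roughly 6 GB (2 bytes per parameter). The V100 does not support bfloat16 natively, so torch.float16 may be a better choice on that hardware.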

  5. Additional Resources: Detailed recipes for various setups are available in the huggingface-llama-recipes repository.
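The steps above can be wrapped in two small helpers: one that checks whether the installed transformers meets the 4.43.0 minimum from step 1, and one that extracts just the reply text from the result printed in step 3. This is a sketch rather than an official utility. The helper names are hypothetical, and the output layout is assumed to be the chat-style list of role/content messages that the pipeline returns when given a list of messages.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = (4, 43, 0)  # minimum version stated in step 1


def parse_release(version_string):
    """Turn '4.43.0' or '4.44.0.dev0' into a tuple of three integers."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:  # stop at suffixes such as 'dev0'
            break
        parts.append(int(digits))
    parts = parts[:3] + [0] * (3 - len(parts[:3]))  # pad '4.43' to (4, 43, 0)
    return tuple(parts)


def transformers_is_recent_enough():
    """Return True if transformers is installed and meets the minimum."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return False
    return parse_release(installed) >= MIN_TRANSFORMERS


def last_assistant_reply(outputs):
    """Pull the reply text out of a chat-style pipeline result."""
    conversation = outputs[0]["generated_text"]  # list of message dicts
    final_message = conversation[-1]
    if final_message.get("role") != "assistant":
        raise ValueError("last message is not from the assistant")
    return final_message["content"]
```

With the pipeline from step 3, calling last_assistant_reply(outputs) would return only the generated string instead of the full message dictionary.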

License

Llama 3.2 is governed by the Llama 3.2 Community License, a custom commercial license agreement. The license grants a non-exclusive, worldwide, non-transferable, and royalty-free limited license to use, reproduce, and modify the Llama Materials, with specific conditions for redistribution and compliance with applicable laws.
