Llama 3.1 8B Instruct

meta-llama

Introduction

Meta-Llama-3.1-8B-Instruct is a multilingual large language model developed by Meta. It is the 8B instruction-tuned member of the Llama 3.1 collection, optimized for dialogue and text generation across multiple languages. According to Meta, the instruction-tuned models outperform many available open-source and closed chat models on common industry benchmarks.

Architecture

Llama 3.1 is an auto-regressive language model built on a transformer architecture: it generates text one token at a time, conditioning each new token on everything produced so far. The instruction-tuned versions are trained with supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align the model with human preferences for helpfulness and safety.
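To make the term "auto-regressive" concrete, here is a minimal sketch of a greedy decoding loop. It is not Llama's implementation: `toy_scores` is a hypothetical stand-in for the transformer, and the five-token vocabulary and end-of-sequence id are assumptions chosen only so the loop runs on its own.

```python
from typing import Callable, List


def greedy_decode(
    next_token_scores: Callable[[List[int]], List[float]],
    prompt: List[int],
    eos_id: int,
    max_new_tokens: int,
) -> List[int]:
    """Generate tokens one at a time, feeding each choice back in as context."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # The model scores every vocabulary entry given all tokens so far.
        scores = next_token_scores(tokens)
        # Greedy decoding: pick the highest-scoring token id.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def toy_scores(tokens: List[int]) -> List[float]:
    """Hypothetical stand-in for a model: prefers (last token + 1) mod 5."""
    vocab_size = 5
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


generated = greedy_decode(toy_scores, prompt=[0], eos_id=4, max_new_tokens=10)
print(generated)
```

A real model replaces `toy_scores` with a forward pass over the token sequence, and production decoders usually sample from the scores rather than always taking the maximum.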

Training

The model was pretrained on approximately 15 trillion tokens from publicly available sources and fine-tuned with over 25 million synthetically generated examples. It supports eight languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. Training was compute-intensive: Meta reports a cumulative 39.3 million GPU hours across the Llama 3.1 collection, a total that covers all model sizes rather than the 8B model alone.

Guide: Running Locally

Basic Steps

  1. Install Transformers Library: Ensure you have transformers >= 4.43.0 installed.

    pip install --upgrade transformers
    
  2. Set Up Environment: Import necessary modules and configure the pipeline.

    import torch
    import transformers
    
    model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )
    
  3. Run Inference: Use the pipeline to generate responses.

    messages = [
        {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
        {"role": "user", "content": "Who are you?"},
    ]
    
    outputs = pipeline(
        messages,
        max_new_tokens=256,
    )
    print(outputs[0]["generated_text"][-1])
    
  4. Explore Further: For advanced implementations including tool use, refer to the Hugging Face Llama Recipes.
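The final `print` in step 3 shows the whole last message as a dictionary. When only the reply text is needed, a small helper can pull it out. The sketch below assumes the pipeline returns the conversation as a list of role/content dictionaries with the assistant's message appended last, as the indexing in step 3 suggests. `last_assistant_reply` and `sample_outputs` are hypothetical names, and the sample data is hand-written rather than real model output.

```python
from typing import Any, Dict, List


def last_assistant_reply(outputs: List[Dict[str, Any]]) -> str:
    """Return the text of the final assistant message in a chat pipeline result."""
    conversation = outputs[0]["generated_text"]
    final_message = conversation[-1]
    if final_message.get("role") != "assistant":
        raise ValueError("expected the final message to come from the assistant")
    return final_message["content"]


# Hand-written stand-in with the assumed output shape (not real model output).
sample_outputs = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "Arrr, I be a pirate chatbot!"},
        ]
    }
]

reply = last_assistant_reply(sample_outputs)
print(reply)
```

With the real pipeline, pass the `outputs` value from step 3 in place of `sample_outputs`.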

Suggested Cloud GPUs

For better performance and scalability, run the model on a GPU instance from a cloud provider such as AWS, Google Cloud, or Azure.
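When choosing a GPU, a rough estimate of the memory needed for the weights is a useful starting point. The sketch below assumes about 8 billion parameters (taken from the "8B" in the model name, rounded) and the standard byte widths for each data type. It covers weights only: activations, the key-value cache, and framework overhead need additional headroom.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: roughly 8e9 parameters, read off the "8B" in the model name.
APPROX_PARAMS = 8e9

# Storage width of each data type in bytes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

for dtype, width in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_memory_gib(APPROX_PARAMS, width):.1f} GiB")
```

In bfloat16, the data type used in the pipeline setup above, the weights alone come to roughly 15 GiB, so a GPU with more memory than that is needed to leave room for inference.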

License

The model is distributed under the Llama 3.1 Community License. It grants a non-exclusive, worldwide, royalty-free license to use, reproduce, and distribute the Llama Materials, with specific terms for redistribution and commercial use. The full terms are set out in the license text.
