zephyr-orpo-141b-A35b-v0.1

HuggingFaceH4

Introduction

Zephyr 141B-A39B is a language model in the Zephyr series, trained to act as a helpful assistant. It is fine-tuned from the mistral-community/Mixtral-8x22B-v0.1 model using the Odds Ratio Preference Optimization (ORPO) algorithm, and was developed collaboratively by Argilla, KAIST, and Hugging Face.
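ORPO combines a standard language-modeling loss on the preferred response with a penalty on the log odds ratio between the chosen and rejected responses, so no separate reference model is needed. A minimal sketch of the idea (this is not the TRL implementation; the helper names and the default λ are made up for illustration):

```python
import math

def log_odds(avg_logp):
    # odds(y|x) = p / (1 - p), where p = exp(mean per-token log-prob)
    p = math.exp(avg_logp)
    return math.log(p / (1.0 - p))

def orpo_loss(avg_logp_chosen, avg_logp_rejected, lam=0.1):
    # SFT term: negative log-likelihood of the chosen response
    nll = -avg_logp_chosen
    # Odds-ratio term: push the odds of the chosen response above the rejected one
    ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    penalty = -math.log(1.0 / (1.0 + math.exp(-ratio)))  # -log sigmoid(ratio)
    return nll + lam * penalty
```

The loss is small when the model already assigns high probability to the chosen response and much higher odds to it than to the rejected one, and grows when the preference is reversed.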

Architecture

Zephyr 141B-A39B is a Mixture of Experts (MoE) model with 141 billion total parameters and 39 billion active parameters. It is primarily designed for English language tasks and fine-tuned on a mix of synthetic datasets. The model is licensed under Apache 2.0.
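In a Mixtral-style MoE layer, a router scores every expert for each token and only the top-k experts (two per token for Mixtral's eight) are actually evaluated, which is why only 39 billion of the 141 billion parameters are active at a time. A framework-free sketch of top-k gating (the expert count, logits, and function names here are illustrative, not taken from the model's code):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(router_logits, top_k=2):
    # Select the top_k experts for this token and renormalize their weights,
    # so the chosen experts' outputs can be mixed as a weighted sum.
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))
```

For example, `route([0.1, 2.0, -1.0, 1.5, 0.0, 0.3, -0.5, 0.7])` sends the token to experts 1 and 3 with weights that sum to one; the other six experts do no work for that token.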

Training

The model was trained on argilla/distilabel-capybara-dpo-7k-binarized, a dataset of high-quality, multi-turn preference pairs scored by LLMs. Training ran on four nodes of 8× H100 GPUs (32 devices in total) for roughly 1.3 hours, using the Adam optimizer with a learning rate of 5e-06 in a distributed multi-GPU setup.

Guide: Running Locally

To run Zephyr 141B-A39B locally, you will need to install the necessary packages and set up the environment:

  1. Install Packages

    pip install 'transformers>=4.39.3' accelerate
    
  2. Set Up the Environment
    Use the code snippet below to initialize the pipeline:

    import torch
    from transformers import pipeline
    
    pipe = pipeline(
        "text-generation",
        model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1",
        device_map="auto",
        torch_dtype=torch.bfloat16,
    )
    
  3. Run the Model
    Pass messages to the pipeline to generate text:

    messages = [
        {"role": "system", "content": "You are Zephyr, a helpful assistant."},
        {"role": "user", "content": "Explain how Mixture of Experts work in language a child would understand."},
    ]
    outputs = pipe(
        messages,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
    )
    print(outputs[0]["generated_text"][-1]["content"])
    

Cloud GPUs: Given the model's size, consider running it on cloud services such as AWS EC2, Google Cloud's AI Platform, or Azure Machine Learning with multi-GPU instances.
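When sizing an instance, a quick back-of-envelope estimate helps: at bfloat16 (2 bytes per parameter), the 141B weights alone occupy on the order of 260 GiB of GPU memory, before accounting for KV cache and activations. A small helper for this arithmetic (the function name is made up for this sketch):

```python
def weight_memory_gib(n_params_billion, bytes_per_param=2):
    # bf16 stores each parameter in 2 bytes; this counts weights only,
    # ignoring KV cache, activations, and framework overhead.
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

print(f"{weight_memory_gib(141):.0f} GiB")  # weights for all 141B parameters
```

This is why `device_map="auto"` is used above: the weights must be sharded across several GPUs (e.g. 4× or 8× 80 GB cards) rather than loaded onto one device.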

License

Zephyr 141B-A39B is released under the Apache 2.0 license. This allows for both personal and commercial use, modification, and distribution of the model, provided that the original license is included with any substantial portions of the software.
