Llama3 Open Bio L L M 70 B

aaditya

Introduction

OpenBioLLM-70B is an advanced open-source biomedical language model developed by Saama AI Labs. It is designed to handle a variety of tasks in the medical and life sciences domains. The model specializes in understanding and generating text with high domain-specific accuracy across a broad range of biomedical applications.

Architecture

OpenBioLLM-70B is based on the Meta-Llama-3-70B architecture, featuring 70 billion parameters. It utilizes cutting-edge techniques, including Direct Preference Optimization (DPO) and a custom medical instruction dataset. The model outperforms other open-source models, including larger ones, on biomedical benchmarks.

Training

OpenBioLLM-70B was trained using a powerful infrastructure, including 8 H100 80GB GPUs and the adamw_bnb_8bit optimizer. Key training hyperparameters include a learning rate of 0.0002, a cosine learning rate scheduler, and a batch size of 12 for training. The model employs advanced training techniques such as the qlora adapter for parameter-efficient fine-tuning.

Guide: Running Locally

  1. Install Dependencies: Ensure you have transformers, torch, and other required packages installed.

  2. Load Model: Use the transformers library to load the model with the following code snippet:

    import transformers
    import torch
    
    model_id = "aaditya/OpenBioLLM-Llama3-70B"
    
    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device="auto",
    )
    
  3. Run Inference: Prepare input messages and generate outputs using the pipeline.

  4. Hardware Requirements: For optimal performance, use a cloud GPU service like AWS or Google Cloud with access to powerful GPUs such as the NVIDIA A100 or H100.

License

OpenBioLLM-70B is released under the Meta-Llama License. It is intended for research and development purposes, with a strong advisory against its use in direct patient care or clinical decision support without further validation.

More Related APIs in Text Generation