Llama3-OpenBioLLM-70B
Introduction
OpenBioLLM-70B is an advanced open-source biomedical language model developed by Saama AI Labs. It is designed to handle a variety of tasks in the medical and life sciences domains. The model specializes in understanding and generating text with high domain-specific accuracy across a broad range of biomedical applications.
Architecture
OpenBioLLM-70B is based on the Meta-Llama-3-70B architecture, featuring 70 billion parameters. It was fine-tuned using Direct Preference Optimization (DPO) together with a custom medical instruction dataset. The model outperforms other open-source models, including larger ones, on biomedical benchmarks.
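For context on what the DPO stage optimizes, below is a minimal sketch of the DPO objective in PyTorch; the function name and the beta value are illustrative and not taken from the model's published training recipe.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (sketch).

    Each argument is a tensor of per-sequence log-probabilities
    (summed token log-probs) for the chosen / rejected responses
    under the policy being trained and a frozen reference model.
    """
    # Implicit reward: scaled log-ratio of policy to reference
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Logistic loss that widens the margin between chosen and rejected answers
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```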
Training
OpenBioLLM-70B was trained on a powerful infrastructure of 8 NVIDIA H100 80GB GPUs using the adamw_bnb_8bit optimizer. Key training hyperparameters include a learning rate of 0.0002, a cosine learning rate scheduler, and a training batch size of 12. The model employs QLoRA adapters for parameter-efficient fine-tuning.
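As an illustration of how such a run can be configured, the sketch below combines the reported hyperparameters with a QLoRA setup via transformers, bitsandbytes, and peft; the LoRA rank, alpha, dropout, and target modules are assumptions, not the published recipe.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: quantize base weights to 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-70B",          # base model named on the card
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(                   # trainable low-rank adapters (values assumed)
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="openbiollm-qlora",
    learning_rate=2e-4,                     # reported learning rate
    lr_scheduler_type="cosine",             # reported scheduler
    per_device_train_batch_size=12,         # reported batch size
    optim="adamw_bnb_8bit",                 # reported 8-bit AdamW optimizer
    bf16=True,
)
```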
Guide: Running Locally
- Install Dependencies: Ensure you have transformers, torch, and the other required packages installed.
- Load Model: Use the transformers library to load the model with the following code snippet:

```python
import transformers
import torch

model_id = "aaditya/OpenBioLLM-Llama3-70B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
```

- Run Inference: Prepare input messages and generate outputs using the pipeline; a usage sketch follows this list.
- Hardware Requirements: For optimal performance, use a cloud GPU service such as AWS or Google Cloud with access to powerful GPUs such as the NVIDIA A100 or H100.
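To complete the Run Inference step, here is a usage sketch in the style of the standard Llama 3 chat recipe; the system prompt, question, and sampling parameters are illustrative examples, not official recommendations.

```python
# Build chat messages for the pipeline created above
messages = [
    {"role": "system", "content": "You are an expert and helpful medical assistant."},
    {"role": "user", "content": "What are the common symptoms of type 2 diabetes?"},
]

# Format the messages into a Llama 3 chat prompt
prompt = pipeline.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Stop generation on either the end-of-sequence or end-of-turn token
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])
```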
License
OpenBioLLM-70B is released under the Meta Llama 3 Community License. It is intended for research and development purposes, with a strong advisory against its use in direct patient care or clinical decision support without further validation.