Llama-3.1-8B-UltraMedical

TsinghuaC3I

Introduction

Llama-3.1-8B-UltraMedical is a large language model (LLM) specialized in biomedicine, developed by the Tsinghua C3I Lab. It is designed to improve medical examination accessibility, literature comprehension, and clinical knowledge. The model builds on Meta's Llama-3.1-8B foundation and is available as an open-access resource.

Architecture

Llama-3.1-8B-UltraMedical is based on the Meta-Llama-3.1-8B-Instruct model. It leverages a large-scale, high-quality dataset known as the UltraMedical collection, which consists of 410,000 synthetic and manually curated biomedical instruction samples and over 100,000 preference data points.

Training

The model is trained with supervised fine-tuning (SFT) followed by iterative preference learning using methods such as Direct Preference Optimization (DPO) and Kahneman-Tversky Optimization (KTO). Training on the UltraMedical dataset drives the model's specialization in biomedicine and underpins its strong performance on medical tasks.
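To make the preference-learning step concrete, the DPO objective for a single preference pair can be sketched in pure Python as below. This is a minimal illustration, not the training implementation: the inputs are sequence log-probabilities under the policy and the frozen reference model, and `beta` is an assumed hyperparameter name.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair (illustrative, scalar version).

    pi_* are log-probabilities of the chosen/rejected responses under the
    policy being trained; ref_* are the same quantities under the frozen
    reference model. beta scales the implicit KL-style penalty.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)): small when the policy prefers the chosen
    # response more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss shrinks as the policy's preference for the chosen over the rejected response grows relative to the reference model, which is what each DPO iteration optimizes over the preference data.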

Guide: Running Locally

To run Llama-3.1-8B-UltraMedical locally, follow these steps:

  1. Install Dependencies: Ensure you have Python and the transformers and vllm libraries installed.
  2. Load the Model: Use the AutoTokenizer and LLM classes to load the model and tokenizer.
  3. Set Sampling Parameters: Define parameters such as temperature, top_p, and max_tokens for model inference.
  4. Generate Outputs: Input your prompts and generate responses using the model.
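The steps above can be sketched as a single script. This is a hedged example: the Hub model ID `TsinghuaC3I/Llama-3.1-8B-UltraMedical`, the example question, and the sampling values are assumptions for illustration, and running it requires a GPU with the model weights downloaded.

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "TsinghuaC3I/Llama-3.1-8B-UltraMedical"  # assumed Hub ID

# Step 2: load the tokenizer and the model (vLLM handles weight loading)
tokenizer = AutoTokenizer.from_pretrained(model_id)
llm = LLM(model=model_id)

# Step 3: set sampling parameters (illustrative values)
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=1024)

# Step 4: format the prompt with the model's chat template and generate
messages = [{"role": "user", "content": "What are first-line treatments for hypertension?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = llm.generate([prompt], sampling_params)
print(outputs[0].outputs[0].text)
```

Using the tokenizer's chat template rather than a raw string ensures the prompt matches the instruction format the model was fine-tuned on.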

For optimal performance, consider utilizing cloud GPUs from providers such as AWS, Google Cloud, or Azure.

License

Llama-3.1-8B-UltraMedical is distributed under Meta's Llama 3 license. Please refer to the specific licensing terms for usage details.
