Llama-3.1-8B-UltraMedical
Introduction
Llama-3.1-8B-UltraMedical is a large language model (LLM) specialized in biomedicine, developed by the Tsinghua C3I Lab. It is designed to improve medical examination accessibility, literature comprehension, and clinical knowledge. The model builds on Meta's Llama-3.1-8B foundation and is available as an open-access resource.
Architecture
Llama-3.1-8B-UltraMedical is based on the Meta-Llama-3.1-8B-Instruct model. It leverages a large-scale, high-quality dataset known as the UltraMedical collection, which consists of 410,000 synthetic and manually curated biomedical instruction samples and over 100,000 preference data points.
Training
The model is trained with supervised fine-tuning (SFT) followed by iterative preference learning methods such as Direct Preference Optimization (DPO) and Kahneman-Tversky Optimization (KTO). The UltraMedical dataset drives the model's biomedical specialization and underpins its strong performance on medical tasks.
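To make the preference-learning step concrete, here is a minimal pure-Python sketch of the per-example DPO objective. The function names, the beta value, and the log-probability inputs are illustrative assumptions for exposition, not the lab's actual training configuration.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log sigmoid(beta * (the policy's chosen-vs-
    rejected log-prob margin minus the reference model's margin))."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable -log(sigmoid(logits)).
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy prefers the chosen response more strongly than the reference model does, `logits` is positive and the loss drops below `log 2`; gradient descent on this loss pushes the policy further toward the preferred responses.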
Guide: Running Locally
To run Llama-3.1-8B-UltraMedical locally, follow these steps:
- Install Dependencies: Ensure you have Python with the `transformers` and `vllm` libraries installed.
- Load the Model: Use the `AutoTokenizer` and `LLM` classes to load the tokenizer and model.
- Set Sampling Parameters: Define parameters such as `temperature`, `top_p`, and `max_tokens` for model inference.
- Generate Outputs: Input your prompts and generate responses using the model.
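The steps above can be sketched as follows. This assumes `vllm` and `transformers` are installed and a GPU with enough memory is available; the model ID, question, and sampling values are illustrative.

```python
# Assumed Hugging Face model ID for this card; verify before use.
MODEL_ID = "TsinghuaC3I/Llama-3.1-8B-UltraMedical"

# Example sampling parameters (temperature, top_p, max_tokens).
SAMPLING_KWARGS = {"temperature": 0.7, "top_p": 0.9, "max_tokens": 1024}

def build_messages(question: str) -> list:
    """Wrap a medical question in the chat-message format the
    tokenizer's chat template expects."""
    return [{"role": "user", "content": question}]

def main() -> None:
    # Heavy imports kept inside main() so the sketch can be read
    # without vLLM installed.
    from transformers import AutoTokenizer
    from vllm import LLM, SamplingParams

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    llm = LLM(model=MODEL_ID)

    messages = build_messages(
        "What are first-line treatments for hypertension?")
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True)
    outputs = llm.generate([prompt], SamplingParams(**SAMPLING_KWARGS))
    print(outputs[0].outputs[0].text)

if __name__ == "__main__":
    main()
```

Running `python run_ultramedical.py` (hypothetical filename) downloads the weights on first use and prints the model's answer to the example question.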
For optimal performance, consider utilizing cloud GPUs from providers such as AWS, Google Cloud, or Azure.
License
Llama-3.1-8B-UltraMedical is distributed under the Llama 3 license. Please refer to the license for specific usage terms.