Bio-Medical-MultiModal-Llama-3-8B-V1
ContactDoctor
Introduction
Bio-Medical-MultiModal-Llama-3-8B-V1 is a specialized large language model designed for biomedical applications. It is fine-tuned from the Llama-3-8B-Instruct model on a custom dataset of over 500,000 entries. The dataset contains both text and image data and combines synthetic and manually curated samples to provide comprehensive coverage of biomedical knowledge.
Architecture
- Model Name: Bio-Medical-MultiModal-Llama-3-8B-V1
- Base Model: Llama-3-8B-Instruct
- Parameter Count: 8 billion
- Dataset Composition: Custom high-quality biomedical text and image dataset
Training
Bio-Medical-MultiModal-Llama-3-8B-V1 was trained on NVIDIA H100 GPUs to handle large-scale data efficiently. Training used MiniCPM to manage multimodal data. The model was evaluated rigorously to ensure robustness and reliability in real-world biomedical applications.
Training Hyperparameters
- Learning Rate: 0.0002
- Train Batch Size: 4
- Eval Batch Size: 4
- Number of Epochs: 3
- Optimizer: Adam with betas=(0.9, 0.999)
- Mixed Precision Training: Native AMP
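The hyperparameters above can be gathered into a small configuration object. The following is a minimal sketch, not the authors' training script: `TrainConfig` and `steps_per_epoch` are hypothetical names introduced here, and the step count assumes a single device with no gradient accumulation, which the source does not state. The 500,000 figure is the lower bound on dataset size given in the introduction.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as listed in the model card (hypothetical container)."""
    learning_rate: float = 2e-4
    train_batch_size: int = 4
    eval_batch_size: int = 4
    num_epochs: int = 3
    adam_betas: tuple = (0.9, 0.999)
    mixed_precision: str = "native_amp"


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Optimizer steps per epoch, assuming one device and no gradient accumulation."""
    return math.ceil(num_samples / batch_size)


config = TrainConfig()
# 500_000 is the "over 500,000 entries" lower bound from the introduction.
per_epoch = steps_per_epoch(500_000, config.train_batch_size)
total_steps = per_epoch * config.num_epochs
print(f"{per_epoch} steps/epoch, {total_steps} steps in total")
```

Under these assumptions a batch size of 4 means a large number of small optimizer steps; any gradient accumulation or multi-GPU data parallelism in the real run would reduce the count proportionally.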
Framework Versions
- PEFT: 0.11.0
- Transformers: 4.40.2
- PyTorch: 2.1.2
- Datasets: 2.19.1
- Tokenizers: 0.19.1
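To compare a local environment against the versions listed above, a short check with the standard library's `importlib.metadata` may help. This is an illustrative sketch: the helper names `installed_version` and `find_mismatches` are hypothetical, and the keys are assumed to be the PyPI distribution names of the listed frameworks (for example `torch` for PyTorch). Newer versions may well work; the check only reports differences.

```python
from importlib import metadata

# Versions as listed in the model card; keys are PyPI distribution names.
EXPECTED_VERSIONS = {
    "peft": "0.11.0",
    "transformers": "4.40.2",
    "torch": "2.1.2",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Local build suffixes such as "+cu121" are ignored in the comparison.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED_VERSIONS).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

The `lookup` parameter exists only so the comparison can be exercised without touching the real environment, for instance by passing the `get` method of a dictionary.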
Guide: Running Locally
To run the model locally, follow these basic steps:
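- Install Dependencies: Install the required libraries, including PyTorch, Transformers, and Pillow (PIL) for image processing. The loading code in the next step also relies on bitsandbytes for 4-bit quantization, accelerate for device_map="auto", and flash-attn for attn_implementation="flash_attention_2".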
- Load the Model and Tokenizer:

      import torch
      from PIL import Image
      from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

      # 4-bit NF4 quantization with double quantization; compute in float16
      bnb_config = BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_quant_type="nf4",
          bnb_4bit_use_double_quant=True,
          bnb_4bit_compute_dtype=torch.float16,
      )

      # trust_remote_code=True is needed because the repository ships custom model code
      model = AutoModel.from_pretrained(
          "ContactDoctor/Bio-Medical-MultiModal-Llama-3-8B-V1",
          quantization_config=bnb_config,
          device_map="auto",
          torch_dtype=torch.float16,
          trust_remote_code=True,
          attn_implementation="flash_attention_2",
      )

      tokenizer = AutoTokenizer.from_pretrained(
          "ContactDoctor/Bio-Medical-MultiModal-Llama-3-8B-V1",
          trust_remote_code=True,
      )
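- Prepare Input Data:

      # Replace the placeholder string with the path to your image file
      image = Image.open("Path to Your image").convert('RGB')
      question = 'Give the modality, organ, analysis, abnormalities (if any), treatment (if abnormalities are present)?'
      # A single user turn containing the image followed by the question
      msgs = [{'role': 'user', 'content': [image, question]}]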
- Run the Model:

      res = model.chat(
          image=image,
          msgs=msgs,
          tokenizer=tokenizer,
          sampling=True,
          temperature=0.95,
          stream=True,
      )

      # With stream=True the call yields text chunks; print and accumulate them
      generated_text = ""
      for new_text in res:
          generated_text += new_text
          print(new_text, flush=True, end='')
- Suggested Cloud GPUs: Consider using cloud services that offer NVIDIA GPUs such as the H100 for efficient model execution.
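When choosing a GPU, a rough estimate of the memory taken by the weights can be useful. The sketch below is back-of-the-envelope arithmetic only: it uses the 8 billion parameter count from the Architecture section and the 4-bit setting from the loading step, and it ignores activations, the KV cache, quantization metadata, and any extra parameters in the vision components. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


PARAMS = 8e9  # "8 billion" from the Architecture section

for label, bits in [("float16", 16), ("4-bit NF4", 4)]:
    size = weight_memory_gib(PARAMS, bits)
    print(f"{label}: about {size:.1f} GiB for weights")
```

By this estimate the float16 weights need roughly 15 GiB, while the 4-bit configuration needs under 4 GiB, so actual usage at inference time will be somewhat higher than either figure once the ignored overheads are included.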
License
The Bio-Medical-MultiModal-Llama-3-8B-V1 model is available under a non-commercial use license. Please review the terms and conditions before using the model.