medical_o1_verifier_3B
by FreedomIntelligence

Introduction
This model is a medical verifier that assesses the correctness of large language model (LLM) outputs on verifiable medical problems. The verification signal is used to improve the medical reasoning abilities of LLMs. Further details can be found in the paper and on the GitHub repository. You may also want to explore HuatuoGPT-o1, a medical LLM designed for complex medical reasoning.
Architecture
The model is based on the meta-llama/Llama-3.2-3B-Instruct architecture and is tagged for text classification, particularly in the medical domain. It supports both English and Chinese and is trained on the FreedomIntelligence/medical-o1-verifiable-problem dataset.
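If you want to inspect the training data, it can be loaded with the Hugging Face datasets library. The snippet below is a minimal sketch: the dataset ID comes from this card, but the split name and field layout are assumptions, so check the dataset card for the actual schema.

```python
# Minimal sketch for inspecting the training data. The dataset ID is taken
# from this card; the "train" split name is an assumption.
from datasets import load_dataset

dataset = load_dataset("FreedomIntelligence/medical-o1-verifiable-problem")
print(dataset)              # lists the available splits and features
print(dataset["train"][0])  # shows one example, assuming a "train" split
```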
Training
The model uses the AutoTokenizer and AutoModelForSequenceClassification classes from the Transformers library. It is loaded with torch_dtype="auto" and device_map="auto", uses "flash_attention_2" as the attention implementation, and is set up for binary classification with two labels.
Guide: Running Locally
To run the model locally:
- Install the dependencies: Ensure you have the transformers library and PyTorch installed.

```bash
pip install transformers torch
```
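Note that the loading code in the next step requests "flash_attention_2", which requires the separate flash-attn package and a compatible NVIDIA GPU. This extra install line is an assumption based on that argument; alternatively, drop attn_implementation to fall back to the default attention.

```bash
pip install flash-attn --no-build-isolation
```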
- Load the model and tokenizer: Use the following Python code to load the model and tokenizer.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch.nn.functional as F

model_path = 'FreedomIntelligence/medical_o1_verifier_3B'
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto",
    attn_implementation="flash_attention_2",
    num_labels=2
)
```
- Prepare input data: Format your model response and reference answer using the evaluation template provided in the repository; an illustrative stand-in is sketched below.
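The exact evaluation template is distributed with the project's GitHub repository. The version below is a hypothetical stand-in with the same call signature (three {} slots, filled with the model response, the reference answer, and the EOS token), included only so the snippet in the next step runs end to end.

```python
# Hypothetical stand-in for the evaluation template; the real prompt is in
# the project's GitHub repository. It must expose three {} slots, filled in
# order with the LLM response, the ground-truth answer, and the EOS token.
template = (
    "<Model Response>\n{}\n</Model Response>\n\n"
    "<Reference Answer>\n{}\n</Reference Answer>{}"
)
```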
- Tokenize and evaluate: Tokenize the inputs, pass them through the model, and interpret the logits.

```python
LLM_response = 'The answer is 25 percentage'
ground_truth_answer = '25%'

# Fill the evaluation template and move the batch to the model's device.
input_batch = tokenizer(
    [template.format(LLM_response, ground_truth_answer, tokenizer.eos_token)],
    return_tensors="pt"
).to(model.device)

# Label 1 means the response matches the reference answer.
logits = model(**input_batch, return_dict=True).logits
probabilities = F.softmax(logits, dim=-1)
result = "True" if probabilities[0, 1] > 0.5 else "False"
print(f"Evaluation Result: {result}")
```
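To verify many responses at once, the same model can score a padded batch. This is a hedged sketch rather than the card's own recipe: it assumes the tokenizer ships without a padding token (common for Llama checkpoints), so it reuses the EOS token and mirrors that choice in the model config.

```python
import torch

# Batched verification sketch (assumed usage, not from the model card).
pairs = [
    ('The answer is 25 percentage', '25%'),
    ('The diagnosis is type 2 diabetes', 'Type 2 diabetes mellitus'),
]

# Llama tokenizers often lack a pad token; reuse EOS so padding works, and
# tell the model which ID marks padding so it can find the last real token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

texts = [template.format(r, a, tokenizer.eos_token) for r, a in pairs]
batch = tokenizer(texts, return_tensors="pt", padding=True).to(model.device)

with torch.no_grad():
    probabilities = torch.softmax(model(**batch).logits, dim=-1)

# Column 1 holds the probability that the response matches the reference.
for (response, answer), p in zip(pairs, probabilities[:, 1].tolist()):
    print(f"{response!r} vs {answer!r}: {'True' if p > 0.5 else 'False'} ({p:.3f})")
```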
For optimal performance, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure.
License
This model is licensed under the Apache License 2.0.