medical_o1_verifier_3B

FreedomIntelligence

Introduction

This model is a medical verifier designed to judge whether a large language model's (LLM) answer to a verifiable medical problem matches the reference answer. This verification process is used to improve the medical reasoning abilities of LLMs. Further details can be found in the paper and on the GitHub repository. Additionally, consider exploring HuatuoGPT-o1, a medical LLM focused on complex medical reasoning.

Architecture

The model is built on the meta-llama/Llama-3.2-3B-Instruct architecture and is intended for text classification in the medical domain. It supports English and Chinese and was trained on the FreedomIntelligence/medical-o1-verifiable-problem dataset.
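
The training data can be inspected with the datasets library. The snippet below is a minimal sketch; it assumes the dataset exposes a standard train split and simply prints whatever fields it defines.

    from datasets import load_dataset

    # Load the verifiable-problem dataset used to train the verifier and inspect its fields.
    ds = load_dataset("FreedomIntelligence/medical-o1-verifiable-problem", split="train")
    print(ds.column_names)
    print(ds[0])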

Training

The model is loaded with the AutoTokenizer and AutoModelForSequenceClassification classes from the Transformers library. torch_dtype and device mapping are determined automatically, "flash_attention_2" is used as the attention implementation, and the classification head has two labels for the binary verification task (the response either matches the reference answer or it does not).

Guide: Running Locally

To run the model locally:

  1. Install dependencies: Ensure you have the transformers library and PyTorch installed.

    pip install transformers torch
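
     The loading code in step 2 uses device_map="auto" and flash_attention_2, which additionally require the accelerate and flash-attn packages (flash-attn only if your GPU supports it; a fallback is shown in step 2):

    pip install accelerate
    pip install flash-attn --no-build-isolation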
    
  2. Load the model and tokenizer: Use the following Python code to load the model and tokenizer.

    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    import torch.nn.functional as F
    
    model_path = 'FreedomIntelligence/medical_o1_verifier_3B'
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_path, torch_dtype="auto", device_map="auto", attn_implementation="flash_attention_2", num_labels=2
    )
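
     If flash-attn is not installed or your GPU does not support it, PyTorch's built-in scaled-dot-product attention can be used instead (a fallback sketch, assuming a reasonably recent Transformers version):

    # Fallback: use PyTorch SDPA attention when flash-attn is unavailable.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_path, torch_dtype="auto", device_map="auto", attn_implementation="sdpa", num_labels=2
    )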
    
  3. Prepare input data: Format your model response and reference answer using the evaluation template, as sketched below.
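
     The exact template is defined in the GitHub repository; the version below is an illustrative approximation with the same three placeholders (model response, reference answer, EOS token):

    # Illustrative approximation of the evaluation template; use the exact template
    # from the GitHub repository for reliable results.
    template = """<Model Response>\n{}\n</Model Response>\n\n<Reference Answer>\n{}\n</Reference Answer>\n\nEvaluate whether the model response matches the reference answer. Output "True" if it matches, otherwise output "False".{}"""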

  4. Tokenize and evaluate: Tokenize the inputs, pass them through the model, and interpret the results.

    # Example pair: the wording differs but the meaning matches the reference answer.
    LLM_response = 'The answer is 25 percentage'
    ground_truth_answer = '25%'
    # Fill the evaluation template and run the verifier.
    input_batch = tokenizer([template.format(LLM_response, ground_truth_answer, tokenizer.eos_token)], return_tensors="pt").to(model.device)
    logits = model(**input_batch, return_dict=True).logits
    # Index 1 of the two-class output is the probability that the response is correct.
    probabilities = F.softmax(logits, dim=-1)
    result = "True" if probabilities[0, 1] > 0.5 else "False"
    print(f"Evaluation Result: {result}")
    

For optimal performance, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure.

License

This model is licensed under the Apache License 2.0.
