Introduction

BioBART is a generative language model for biomedical text processing, based on the BART architecture. It targets text-to-text generation tasks and is distributed as a PyTorch model. The checkpoint can be served through inference endpoints, and its weights are stored in the Safetensors format for efficient storage and loading.

Architecture

BioBART builds on BART (Bidirectional and Auto-Regressive Transformers), an encoder-decoder Transformer that pairs a bidirectional encoder with a left-to-right autoregressive decoder. BioBART keeps this architecture and adapts it to the biomedical domain through pretraining on biomedical text, which improves its handling of domain-specific language across a range of biomedical generation tasks.

Training

BioBART is pretrained on a variety of biomedical texts, optimizing its ability to generate coherent and contextually relevant text in the biomedical field. The training process is detailed in the paper "BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model" by Hongyi Yuan et al., available on arXiv.
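
To make the pretraining idea concrete: the original BART is trained as a denoising autoencoder, and one of its noising schemes is text infilling, where a span of tokens is replaced by a single <mask> token and the model learns to reconstruct the uncorrupted text. The sketch below illustrates only that corruption step on whitespace-split tokens. The function name infill_span and the hand-picked span are illustrative assumptions; real pretraining samples spans randomly and works on subword tokens, and the exact BioBART settings are described in the paper.

```python
from typing import List

MASK = "<mask>"  # BART's mask token string


def infill_span(tokens: List[str], start: int, length: int) -> List[str]:
    """Replace tokens[start:start + length] with a single mask token.

    A zero-length span inserts a mask without removing any token.
    """
    if start < 0 or length < 0 or start + length > len(tokens):
        raise ValueError("span must lie inside the token list")
    return tokens[:start] + [MASK] + tokens[start + length:]


if __name__ == "__main__":
    # The model sees the corrupted text as input and is trained to
    # produce the original text as output.
    target = "Influenza is a contagious respiratory disease .".split()
    corrupted = infill_span(target, start=3, length=2)
    print("input :", " ".join(corrupted))
    print("target:", " ".join(target))
```

Because one mask can stand for several tokens, the model has to predict how many tokens are missing as well as which ones, which is why a prompt such as "Influenza is a <mask> disease." can yield a multi-word completion.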

Guide: Running Locally

  1. Environment Setup: Ensure Python and PyTorch are installed on your system.
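  2. Get the Model Files: Download the model files from the Hugging Face model page. This step is optional, because from_pretrained in step 5 downloads and caches the files automatically on first use.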
  3. Install Dependencies: Use pip to install the transformers library.
    pip install transformers
    
  4. Load the Model: Use the transformers library to load BioBART for inference tasks.
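  5. Inference Example: The following script runs the example input from the provided widget:
    from transformers import BartTokenizer, BartForConditionalGeneration
    
    # Downloads the checkpoint on first use, then loads it from the local cache.
    tokenizer = BartTokenizer.from_pretrained("GanjinZero/biobart-base")
    model = BartForConditionalGeneration.from_pretrained("GanjinZero/biobart-base")
    
    # <mask> marks the span the model should fill in.
    input_text = "Influenza is a <mask> disease."
    inputs = tokenizer(input_text, return_tensors="pt")
    # Passing the attention mask along with the input IDs avoids ambiguity about padding.
    outputs = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))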
    

Cloud GPUs: Running the model on cloud platforms with GPU support, like AWS, Google Cloud, or Azure, can significantly speed up inference and reduce local computational load.
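
A minimal sketch of GPU-backed inference with the same checkpoint, assuming PyTorch and transformers are installed on the cloud instance. The helper names pick_device and fill_mask are illustrative and not part of any library, and the generation settings (num_beams=4, max_new_tokens=32) are arbitrary example values rather than recommendations from the BioBART authors.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def fill_mask(text: str, device: str,
              model_name: str = "GanjinZero/biobart-base") -> str:
    """Generate a completion for `text` with the model placed on `device`."""
    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(model_name).to(device)
    model.eval()  # disable dropout for inference

    # Inputs must live on the same device as the model weights.
    inputs = tokenizer(text, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=32)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the checkpoint on the first call):
#   device = pick_device()
#   print(fill_mask("Influenza is a <mask> disease.", device))
```

The same script runs unchanged on a CPU-only machine, since pick_device falls back to "cpu" when no GPU is visible.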

License

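BioBART is licensed under the Apache 2.0 License, which permits commercial use, modification, and redistribution, provided that the license text and existing copyright notices are retained and modified files carry a notice that they were changed. Apache 2.0 is a permissive license, so derivative works may be distributed under different license terms.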
