deberta v3 base squad2

deepset

Introduction

The DeBERTa-V3-Base-SQuAD2 model by deepset is a fine-tuned version of Microsoft's DeBERTa-V3-Base, optimized for extractive question answering on the SQuAD 2.0 dataset. Given a question and a context passage, it extracts the answer as a span of the context. Because SQuAD 2.0 includes unanswerable questions, the model can also indicate that the context contains no answer.

Architecture

  • Base Model: Microsoft DeBERTa-V3-Base
  • Language: English
  • Task: Extractive Question Answering
  • Training and Evaluation Data: SQuAD 2.0

Training

The model was trained using the following hyperparameters:

  • Batch Size: 12
  • Number of Epochs: 4
  • Maximum Sequence Length: 512
  • Learning Rate: 2e-5
  • Learning Rate Schedule: Linear warmup over the first 20% of training steps
  • Document Stride: 128
  • Maximum Query Length: 64
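
To make the interplay of these hyperparameters concrete, here is a minimal sketch in plain Python. It is not deepset's training code. It rests on several assumptions: the document stride is taken to be the step between the starts of successive context windows (as in the original BERT SQuAD preprocessing; some libraries instead use "stride" for the overlap), three special tokens are reserved per sequence, and the learning rate decays linearly to zero after warmup. The function names `context_windows` and `learning_rate` are hypothetical.

```python
def context_windows(n_context_tokens, n_query_tokens,
                    max_seq_len=512, doc_stride=128,
                    max_query_len=64, n_special=3):
    """Return (start, end) token offsets of the windows covering a context.

    Assumption: doc_stride is the step between window starts, and
    n_special tokens (e.g. [CLS], [SEP], [SEP]) are reserved per sequence.
    """
    query_len = min(n_query_tokens, max_query_len)  # query is truncated to 64
    capacity = max_seq_len - query_len - n_special  # context tokens per window
    if capacity <= 0:
        raise ValueError("no room left for context tokens")
    windows = []
    start = 0
    while True:
        end = min(start + capacity, n_context_tokens)
        windows.append((start, end))
        if end >= n_context_tokens:
            break
        start += min(capacity, doc_stride)
    return windows


def learning_rate(step, total_steps, peak_lr=2e-5, warmup_proportion=0.2):
    """Linear warmup to peak_lr, then (assumed) linear decay to zero."""
    warmup_steps = int(total_steps * warmup_proportion)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))
```

Under these assumptions, a 1000-token context with a 10-token question is split into five overlapping windows, while any context that fits in a single sequence yields exactly one window.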

Guide: Running Locally

  • Using Haystack:

    1. Install Haystack with the command: pip install haystack-ai "transformers[torch,sentencepiece]".
    2. Load the model and perform extractive question answering using the Haystack framework.
  • Using Transformers:

    1. Import the necessary classes from transformers.
    2. Load the model and tokenizer with AutoModelForQuestionAnswering and AutoTokenizer.
    3. Use the pipeline interface for question-answering tasks.
  • Hardware Recommendation: For optimal performance, especially with large datasets, consider using cloud GPUs like NVIDIA A10G.
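
The Transformers steps above can be sketched roughly as follows. This assumes the model is published on the Hugging Face Hub under the ID `deepset/deberta-v3-base-squad2` and that `transformers` is installed with the `torch` and `sentencepiece` extras. The helper name `answer_question` is hypothetical, and the import is placed inside the function only so that defining it does not require the library to be loaded.

```python
MODEL_NAME = "deepset/deberta-v3-base-squad2"  # assumed Hub model ID


def answer_question(question, context, model_name=MODEL_NAME):
    """Run extractive QA on one question/context pair.

    Downloads the model on first use, so it needs network access.
    """
    from transformers import (
        AutoModelForQuestionAnswering,
        AutoTokenizer,
        pipeline,
    )

    # Step 2: load the fine-tuned model and its tokenizer.
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # Step 3: wrap both in the question-answering pipeline.
    qa = pipeline("question-answering", model=model, tokenizer=tokenizer)

    # handle_impossible_answer lets the pipeline return an empty answer
    # when the context does not contain one (the SQuAD 2.0 setting).
    return qa(question=question, context=context,
              handle_impossible_answer=True)
```

Calling `answer_question("Who fine-tuned the model?", "The model was fine-tuned by deepset.")` should return a dictionary with `answer`, `score`, `start`, and `end` keys. For repeated queries it is cheaper to build the pipeline once and reuse it rather than reloading the model on every call.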

License

This model is licensed under CC-BY-4.0, allowing sharing and adaptation with appropriate credit given.
