deberta v3 base squad2
deepset
Introduction
The DeBERTa-V3-Base-SQuAD2 model by deepset is a version of Microsoft's DeBERTa-V3-Base fine-tuned on the SQuAD 2.0 dataset for extractive question answering. Given a question and a context passage, it extracts the answer as a span of the passage. Because SQuAD 2.0 includes unanswerable questions, the model can also indicate that the context contains no answer.
Architecture
- Base Model: Microsoft DeBERTa-V3-Base
- Language: English
- Task: Extractive Question Answering
- Training and Evaluation Data: SQuAD 2.0
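To make "extractive question answering" concrete, here is a minimal sketch of how an answer span is commonly chosen from per-token start and end scores in SQuAD 2.0 style models. It is a generic illustration, not this model's actual decoding code: the function name `best_span`, the `max_answer_len` default, and the convention that token 0 stands for "no answer" are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,  # hypothetical cap on span length, in tokens
) -> Tuple[Optional[Tuple[int, int]], float]:
    """Pick the highest-scoring (start, end) token span.

    Assumption: index 0 is the special first token, and the sum of its
    start and end scores represents the "no answer" option.
    Returns (None, null_score) when "no answer" wins.
    """
    null_score = start_logits[0] + end_logits[0]
    best: Optional[Tuple[int, int]] = None
    best_score = float("-inf")
    for s in range(1, len(start_logits)):
        # The end token may not precede the start token.
        for e in range(s, min(s + max_answer_len, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    if best is None or null_score >= best_score:
        return None, null_score
    return best, best_score
```

Production pipelines typically add further steps (softmax normalization, a tunable no-answer threshold, mapping token indices back to characters), which this sketch leaves out.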
Training
The model was trained using the following hyperparameters:
- Batch Size: 12
- Number of Epochs: 4
- Maximum Sequence Length: 512
- Learning Rate: 2e-5
- Learning Rate Schedule: Linear warmup with 20% warmup proportion
- Document Stride: 128
- Maximum Query Length: 64
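The schedule and the length-related settings above can be illustrated with two small helpers. This is a sketch under stated assumptions, not the training code used by deepset: it assumes the learning rate decays linearly to zero after warmup, that the document stride is the step between consecutive window starts (some libraries define stride as the overlap instead), that three special tokens are added per sequence, and that the query uses its full 64-token budget. The names `lr_at_step` and `context_windows` are hypothetical.

```python
from typing import List, Tuple


def lr_at_step(step: int, total_steps: int, base_lr: float = 2e-5,
               warmup_proportion: float = 0.2) -> float:
    """Linear warmup to base_lr, then (assumed) linear decay to zero."""
    warmup_steps = int(total_steps * warmup_proportion)
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(total_steps - warmup_steps, 1)
    return base_lr * max(total_steps - step, 0) / remaining


def context_windows(num_doc_tokens: int, max_seq_len: int = 512,
                    max_query_len: int = 64, doc_stride: int = 128,
                    num_special_tokens: int = 3) -> List[Tuple[int, int]]:
    """Split a long context into overlapping (start, end) token windows.

    Worst case: the query takes max_query_len tokens, so each window can
    hold max_seq_len - max_query_len - num_special_tokens context tokens.
    """
    window_len = max_seq_len - max_query_len - num_special_tokens
    if window_len <= 0 or doc_stride <= 0:
        raise ValueError("window length and doc_stride must be positive")
    windows = []
    start = 0
    while True:
        end = min(start + window_len, num_doc_tokens)
        windows.append((start, end))
        if end >= num_doc_tokens:
            break
        start += doc_stride  # assumed: stride is the shift between windows
    return windows
```

With these defaults, a 1000-token context is covered by six overlapping windows, while a 400-token context fits in a single window.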
Guide: Running Locally
- Using Haystack:
  - Install Haystack with the command: pip install haystack-ai "transformers[torch,sentencepiece]"
  - Load the model and perform extractive question answering using the Haystack framework.
- Using Transformers:
  - Import the necessary classes from transformers.
  - Load the model and tokenizer with AutoModelForQuestionAnswering and AutoTokenizer.
  - Use the pipeline interface for question-answering tasks.
- Hardware Recommendation: For optimal performance, especially with large datasets, consider using cloud GPUs like NVIDIA A10G.
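The Haystack and Transformers steps above might look like the following sketch. It assumes the model is published on the Hugging Face Hub as `deepset/deberta-v3-base-squad2` and that the packages from the install command are available. The helper names (`load_transformers_qa`, `ask`, `ask_with_haystack`) are hypothetical. Library imports sit inside the functions so the file can be loaded without triggering a model download.

```python
MODEL_NAME = "deepset/deberta-v3-base-squad2"  # assumed Hugging Face Hub ID


def load_transformers_qa(model_name: str = MODEL_NAME):
    """Build a question-answering pipeline (downloads weights on first use)."""
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return pipeline("question-answering", model=model, tokenizer=tokenizer)


def ask(qa, question: str, context: str):
    """Run one query; an empty answer string means "no answer found"."""
    result = qa(
        question=question,
        context=context,
        handle_impossible_answer=True,  # allow the SQuAD 2.0 no-answer case
    )
    return result["answer"], result["score"]


def ask_with_haystack(question: str, texts, model_name: str = MODEL_NAME):
    """Same task through Haystack's ExtractiveReader component."""
    from haystack import Document
    from haystack.components.readers import ExtractiveReader

    reader = ExtractiveReader(model=model_name)
    reader.warm_up()  # loads the model before the first run
    result = reader.run(
        query=question,
        documents=[Document(content=text) for text in texts],
    )
    return [(answer.data, answer.score) for answer in result["answers"]]
```

For example, `ask(load_transformers_qa(), "Who fine-tuned the model?", "The model was fine-tuned by deepset.")` returns an answer string with its confidence score. On a machine with a GPU, passing a `device` argument to `pipeline` is the usual way to move inference off the CPU.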
License
This model is licensed under CC-BY-4.0, allowing sharing and adaptation with appropriate credit given.