roberta-base-squad2
Introduction
The roberta-base-squad2 model by deepset is a fine-tuned version of the roberta-base model, optimized for extractive question answering on the SQuAD 2.0 dataset. It is specifically trained to handle both answerable and unanswerable questions.
Architecture
- Base Model: FacebookAI/roberta-base
- Language: English
- Task: Extractive Question Answering
- Training Data: SQuAD 2.0
- Evaluation Data: SQuAD 2.0
The model uses the RoBERTa architecture (a robustly optimized BERT pretraining approach) with a span-prediction head on top for extractive question answering.
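As a quick sanity check, the checkpoint can be loaded with Transformers and its configuration inspected. This is a minimal sketch; the printed values in the comments are what the standard roberta-base configuration would give.

```python
from transformers import AutoConfig, AutoModelForQuestionAnswering

model_id = "deepset/roberta-base-squad2"

# The QA head is a linear layer producing start/end span logits
# on top of the roberta-base encoder.
model = AutoModelForQuestionAnswering.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

print(config.model_type)         # "roberta"
print(config.num_hidden_layers)  # 12 (roberta-base encoder layers)
print(config.hidden_size)        # 768
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # ~124M
```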
Training
Training was performed on 4x Tesla V100 GPUs with the following hyperparameters:
- Batch Size: 96
- Epochs: 2
- Max Sequence Length: 386
- Learning Rate: 3e-5
- Learning Rate Schedule: LinearWarmup
- Warmup Proportion: 0.2
- Document Stride: 128
- Max Query Length: 64
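For readers who want to reproduce a comparable run with the Hugging Face Trainer, the hyperparameters above map roughly onto TrainingArguments as sketched below. This is an approximation, not deepset's original training script (the original run used deepset's FARM framework), and the output directory name is hypothetical.

```python
from transformers import TrainingArguments

# Rough mapping of the listed hyperparameters onto TrainingArguments.
args = TrainingArguments(
    output_dir="roberta-base-squad2",  # hypothetical output path
    per_device_train_batch_size=24,    # 24 per GPU x 4 GPUs = batch size 96
    num_train_epochs=2,
    learning_rate=3e-5,
    lr_scheduler_type="linear",        # linear decay after warmup
    warmup_ratio=0.2,                  # warmup proportion
)

# These belong to SQuAD-style tokenization/preprocessing,
# not to TrainingArguments itself:
max_seq_length = 386
doc_stride = 128
max_query_length = 64
```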
Guide: Running Locally
Using Haystack
- Install Haystack and Transformers:
```bash
pip install haystack-ai "transformers[torch,sentencepiece]"
```
- Load the model in Haystack:
```python
from haystack import Document
from haystack.components.readers import ExtractiveReader

docs = [Document(content="Python is a popular programming language")]

reader = ExtractiveReader(model="deepset/roberta-base-squad2")
reader.warm_up()

question = "What is a popular programming language?"
result = reader.run(query=question, documents=docs)
```
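The run() call returns a dictionary keyed by "answers". Assuming the Haystack 2.x ExtractiveReader API, each ExtractedAnswer exposes the matched text and a confidence score:

```python
# Each ExtractedAnswer exposes the matched span and a confidence score;
# a trailing no-answer candidate may appear with data=None.
for answer in result["answers"]:
    print(answer.data, answer.score)
```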
Using Transformers
- Load the model and tokenizer via Transformers:
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/roberta-base-squad2"
nlp = pipeline("question-answering", model=model_name, tokenizer=model_name)

QA_input = {
    "question": "Why is model conversion important?",
    "context": "The option to convert models between FARM and transformers "
               "gives freedom to the user and lets people easily switch "
               "between frameworks.",
}
res = nlp(QA_input)
```
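Because the model is trained on SQuAD 2.0, it can also signal that a question is unanswerable. The question-answering pipeline exposes this through its handle_impossible_answer flag; the question below is an illustrative example:

```python
# A question the context cannot answer; with handle_impossible_answer=True
# the pipeline is allowed to return an empty answer string.
res = nlp(
    question="What is the capital of France?",
    context="The option to convert models between FARM and transformers "
            "gives freedom to the user.",
    handle_impossible_answer=True,
)
print(res)  # e.g. {'score': ..., 'start': 0, 'end': 0, 'answer': ''}
```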
Suggested Cloud GPUs
- Tesla V100
- NVIDIA A100
License
The roberta-base-squad2 model is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).