xlm-roberta-large-squad2

deepset

Introduction

The deepset/xlm-roberta-large-squad2 model is a multilingual, large-scale language model for extractive question answering. It is based on XLM-RoBERTa large and fine-tuned on the SQuAD 2.0 dataset, which includes unanswerable questions, so the model can both extract answer spans and abstain when a context contains no answer, across a wide range of languages.

Architecture

  • Model: XLM-RoBERTa Large
  • Task: Extractive Question Answering
  • Language Support: Multilingual
  • Training Data: SQuAD 2.0
  • Evaluation Data: SQuAD dev set, German MLQA, German XQuAD

Training

Training used the following key hyperparameters:

  • Batch Size: 32
  • Epochs: 3
  • Learning Rate: 1e-5
  • Max Sequence Length: 256
  • Learning Rate Schedule: Linear Warmup
  • Warmup Proportion: 0.2
  • Doc Stride: 128
  • Max Query Length: 64

The training utilized four Tesla V100 GPUs.
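
The original training script is not reproduced here; as a hypothetical illustration, the hyperparameters above could be expressed with the Hugging Face TrainingArguments API roughly as follows (dataset loading and SQuAD-style preprocessing are omitted, and the per-device batch size assumes the four-GPU setup):

    from transformers import TrainingArguments
    
    # Hypothetical mapping of the hyperparameters above onto the Trainer API;
    # this is an illustration, not the actual training script.
    training_args = TrainingArguments(
        output_dir="xlm-roberta-large-squad2",
        per_device_train_batch_size=8,  # 8 per GPU x 4 GPUs = batch size 32
        num_train_epochs=3,
        learning_rate=1e-5,
        lr_scheduler_type="linear",     # linear decay after warmup
        warmup_ratio=0.2,               # 20% of training steps for warmup
    )
    
    # Max sequence length (256), doc stride (128), and max query length (64)
    # are preprocessing parameters applied at tokenization time, e.g.:
    # tokenizer(questions, contexts, max_length=256, stride=128,
    #           truncation="only_second", return_overflowing_tokens=True)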

Guide: Running Locally

To run the model locally, you can use the Transformers library or the Haystack framework.

Using Transformers

  1. Install Required Libraries:

    pip install transformers torch
    
  2. Run the Model:

    from transformers import pipeline
    
    model_name = "deepset/xlm-roberta-large-squad2"
    
    # Build a question-answering pipeline; the tokenizer is loaded from the same checkpoint
    nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
    
    QA_input = {
        'question': 'Why is model conversion important?',
        'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.'
    }
    res = nlp(QA_input)
    print(res)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
    
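Because the model is fine-tuned on SQuAD 2.0, it can also indicate that a context contains no answer. A minimal sketch, reusing the nlp pipeline from above and relying on the handle_impossible_answer flag of the question-answering pipeline; the German example simply illustrates the multilingual checkpoint:

    # SQuAD 2.0 training lets the pipeline return an empty answer string
    # when it predicts the question is unanswerable from the given context.
    res = nlp({
        'question': 'What is the capital of France?',
        'context': 'The option to convert models between FARM and transformers '
                   'gives freedom to the user.'
    }, handle_impossible_answer=True)
    print(res['answer'])  # '' when the model predicts "no answer"
    
    # The checkpoint is multilingual, so question and context need not be English.
    res_de = nlp({
        'question': 'Warum ist Modellkonvertierung wichtig?',
        'context': 'Die Option, Modelle zwischen FARM und transformers zu konvertieren, '
                   'gibt dem Nutzer Freiheit.'
    })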

Using Haystack

  1. Install Haystack:

    pip install haystack-ai "transformers[torch,sentencepiece]"
    
  2. Load the Model:

    from haystack import Document
    from haystack.components.readers import ExtractiveReader
    
    # Wrap the text to search over in Haystack Document objects
    docs = [Document(content="Python is a popular programming language")]
    
    # The reader extracts answer spans from the documents with this model
    reader = ExtractiveReader(model="deepset/xlm-roberta-large-squad2")
    reader.warm_up()  # downloads and loads the model weights
    
    question = "What is a popular programming language?"
    result = reader.run(query=question, documents=docs)
    
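The reader returns a dictionary with an answers list; because the underlying model is trained on SQuAD 2.0, this list also contains a no-answer candidate by default. A short sketch, assuming Haystack 2.x ExtractedAnswer objects with data and score attributes:

    # Print every candidate, including the no-answer option (data is None)
    for answer in result["answers"]:
        label = answer.data if answer.data is not None else "<no answer>"
        print(f"{label!r} (score: {answer.score:.3f})")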

Cloud GPUs

xlm-roberta-large is a sizeable model, so for faster inference or fine-tuning consider cloud-based GPUs such as those offered by AWS, GCP, or Azure.
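
Once a GPU is available, the Transformers pipeline can be placed on it via its device argument (a minimal sketch; device=0 selects the first CUDA device):

    from transformers import pipeline
    
    # device=0 runs the pipeline on the first CUDA GPU; device=-1 falls back to CPU
    nlp = pipeline('question-answering',
                   model="deepset/xlm-roberta-large-squad2",
                   tokenizer="deepset/xlm-roberta-large-squad2",
                   device=0)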

License

The model is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
