roberta-base-chinese-extractive-qa

uer

Introduction

The roberta-base-chinese-extractive-qa model is designed for extractive question answering in Chinese. It is fine-tuned with the UER-py framework; it can also be fine-tuned with TencentPretrain, which inherits UER-py's design, supports models with over one billion parameters, and extends it to a multimodal pre-training framework.

Architecture

This model is a Chinese variant of the RoBERTa-base architecture, specifically adapted for question-answering in the Chinese language. It leverages the robust capabilities of the RoBERTa architecture, optimized for performance in extractive question-answering tasks.
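In extractive QA, the answer is a literal span of the context: the model emits a start logit and an end logit per token, and the predicted span is the (start, end) pair with the highest combined score, subject to start ≤ end. A minimal pure-Python sketch of that decoding step with dummy logits (illustrative only, not the model's actual inference code):

```python
def best_span(start_logits, end_logits, max_len=30):
    """Pick the (start, end) token pair with the highest combined
    score, requiring start <= end and a bounded span length."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_logit in enumerate(start_logits):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best

# Dummy logits for a 6-token context: the model is most confident
# the answer starts at token 2 and ends at token 3.
start_logits = [0.1, 0.2, 5.0, 0.3, 0.1, 0.0]
end_logits   = [0.0, 0.1, 0.4, 4.8, 0.2, 0.1]
print(best_span(start_logits, end_logits))  # → (2, 3)
```

Real implementations additionally mask out question tokens and special tokens before scoring spans.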

Training

The model is fine-tuned on the training splits of three datasets: CMRC2018, WebQA, and Laisi. Fine-tuning runs for three epochs with a sequence length of 512, starting from the pre-trained chinese_roberta_L-12_H-768 model. Training is conducted with UER-py on Tencent Cloud, and the model is saved after any epoch in which it achieves its best performance so far on the development set.
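The fine-tuning procedure above corresponds to UER-py's extractive-QA fine-tuning script. A hedged sketch of the invocation follows; the script name, flag names, and file paths are illustrative assumptions and should be checked against the UER-py repository:

```shell
# Illustrative sketch only -- verify script and flags against UER-py.
python3 finetune/run_cmrc.py \
    --pretrained_model_path models/chinese_roberta_L-12_H-768.bin \
    --vocab_path models/google_zh_vocab.txt \
    --train_path extractive_qa_train.json \
    --dev_path extractive_qa_dev.json \
    --output_model_path models/extractive_qa_model.bin \
    --epochs_num 3 --seq_length 512
```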

Guide: Running Locally

To use the model locally, follow these steps:

  1. Install Hugging Face Transformers:

    pip install transformers
    
  2. Load the model and tokenizer:

    from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
    model = AutoModelForQuestionAnswering.from_pretrained('uer/roberta-base-chinese-extractive-qa')
    tokenizer = AutoTokenizer.from_pretrained('uer/roberta-base-chinese-extractive-qa')
    
  3. Create a question-answering pipeline:

    QA = pipeline('question-answering', model=model, tokenizer=tokenizer)
    
  4. Run a sample query:

    QA_input = {
        # "Who is the author of the famous poem 'If by Life You Were Deceived'?"
        'question': "著名诗歌《假如生活欺骗了你》的作者是",
        # "Pushkin learned the language of the people there and absorbed much
        # useful material, all of which greatly influenced his later work. ..."
        'context': "普希金从那里学习人民的语言,吸取了许多有益的养料,这一切对普希金后来的创作产生了很大的影响。..."
    }
    print(QA(QA_input))
    
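The pipeline returns a dict with `score`, `start`, `end`, and `answer`, where `start` and `end` are character offsets into the context and `answer` is the corresponding slice. A small offline sketch using a hypothetical result dict (the values are made up for illustration; a real call returns the same keys with model-predicted values):

```python
context = "普希金从那里学习人民的语言"

# Hypothetical pipeline output for illustration only.
result = {"score": 0.98, "start": 0, "end": 3, "answer": context[0:3]}

# The answer string is always the context slice [start:end].
assert context[result["start"]:result["end"]] == result["answer"]
print(result["answer"])  # → 普希金
```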

For enhanced performance, especially with large models, consider using cloud GPU services such as AWS, Google Cloud, or Azure.

License

Refer to the Hugging Face Model Hub or the UER-py repository for specific licensing information related to the use and distribution of the model and related codebases.
