german qg t5 quad

dehio

Introduction

GERMAN-QG-T5-QUAD is a question generation model for German. It is a fine-tuned version of valhalla/t5-base-qg-hl, trained on the GermanQuAD dataset. The expected answer in the input passage must be highlighted with the <hl> token.

Architecture

The model uses the T5 encoder-decoder architecture and is implemented in PyTorch. It was fine-tuned for question generation on the GermanQuAD dataset.

Training

The model was trained with the following hyperparameters:

  • Learning Rate: 0.0001
  • Train Batch Size: 2
  • Eval Batch Size: 2
  • Seed: 100
  • Gradient Accumulation Steps: 8
  • Total Train Batch Size: 16
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • Learning Rate Scheduler Type: Linear
  • Number of Epochs: 10
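
The relationship between the batch-size values and the linear schedule can be sketched in a few lines of plain Python. This is an illustration only, not the training script: the function name linear_lr and the value total_steps=1000 are hypothetical, and the sketch assumes a single device and no warmup, neither of which is stated above.

```python
# Hyperparameters as listed above.
train_batch_size = 2
gradient_accumulation_steps = 8
learning_rate = 1e-4

# Each optimizer update accumulates gradients over several small batches
# (assuming a single device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int, total_steps: int, base_lr: float = learning_rate) -> float:
    """Learning rate at `step` under linear decay to zero, assuming no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# total_steps=1000 is a hypothetical value chosen for illustration.
print(total_train_batch_size)
print(linear_lr(step=500, total_steps=1000))
```

Accumulating over 8 steps lets a per-step batch of 2 behave like a batch of 16 for each optimizer update, which keeps memory use low on a single GPU.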

The model achieved a BLEU-4 score of 11.30 on the GermanQuAD test set (n=2204). The training script is available on GitHub.

Guide: Running Locally

To run the GERMAN-QG-T5-QUAD model locally, follow these steps:

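  1. Install Dependencies: Install Transformers, PyTorch, and SentencePiece (required by T5Tokenizer). The model was trained with Transformers 4.13.0.dev0, PyTorch 1.10.0+cu102, Datasets 1.16.1, and Tokenizers 0.10.3. The development build and the CUDA-tagged wheel are generally not installable from PyPI under those exact version strings, so the command below installs current releases instead.

    pip install transformers torch sentencepiece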
    
  2. Load the Model: Use the Transformers library to load the model.

    from transformers import T5Tokenizer, T5ForConditionalGeneration
    
    tokenizer = T5Tokenizer.from_pretrained("dehio/german-qg-t5-quad")
    model = T5ForConditionalGeneration.from_pretrained("dehio/german-qg-t5-quad")
    
  3. Run Inference: Prepare your input and generate questions.

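    # The expected answer ("britischen Common Laws", chosen here for illustration) is wrapped in <hl> tokens.
    input_text = "generate question: Obwohl die Vereinigten Staaten wie auch viele Staaten des Commonwealth Erben des <hl> britischen Common Laws <hl> sind, setzt sich das amerikanische Recht bedeutend davon ab."
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids
    # max_length caps the length of the generated question; 64 is an arbitrary choice.
    output = model.generate(input_ids, max_length=64)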
    
    print(tokenizer.decode(output[0], skip_special_tokens=True))
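
Building the input string by hand is error-prone, so a small helper can mark the answer automatically. The following is a minimal sketch: build_qg_input is a hypothetical name, the "generate question:" prefix is taken from the example above, and wrapping the answer on both sides with <hl> is an assumption based on the highlight format of the base model valhalla/t5-base-qg-hl.

```python
def build_qg_input(context: str, answer: str) -> str:
    """Wrap the first occurrence of `answer` in <hl> tokens and add the task prefix."""
    start = context.find(answer)
    if start == -1:
        raise ValueError("answer not found in context")
    end = start + len(answer)
    # Assumed format: <hl> answer <hl>, with the rest of the passage unchanged.
    highlighted = f"{context[:start]}<hl> {answer} <hl>{context[end:]}"
    return f"generate question: {highlighted}"


context = (
    "Obwohl die Vereinigten Staaten wie auch viele Staaten des Commonwealth "
    "Erben des britischen Common Laws sind, setzt sich das amerikanische "
    "Recht bedeutend davon ab."
)
print(build_qg_input(context, "britischen Common Laws"))
```

The returned string can be passed to the tokenizer in place of input_text. Only the first occurrence of the answer is highlighted, so an answer that appears several times in the passage may need an explicit character offset instead.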
    

Inference works on a CPU, but a GPU speeds up generation noticeably. If no local GPU is available, a cloud GPU instance from a provider such as AWS, Google Cloud, or Azure is an option.

License

The GERMAN-QG-T5-QUAD model is licensed under the MIT License.
