GERMAN-QG-T5-QUAD
Introduction
GERMAN-QG-T5-QUAD is a model designed for generating questions in German. It is a fine-tuned version of the valhalla/t5-base-qg-hl
model, tailored specifically for the GermanQUAD dataset. The model requires the expected answer to be highlighted using a <hl>
token.
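To illustrate the highlighting requirement, here is a minimal sketch of how an input could be prepared. The helper name and the convention of wrapping the answer span on both sides are assumptions borrowed from the upstream valhalla/t5-base-qg-hl format, not part of this model's codebase (the example later in this card shows a single trailing <hl>):

```python
def highlight_answer(context: str, answer: str) -> str:
    """Wrap the answer span in <hl> markers and prepend the task prefix.

    Illustrative helper (hypothetical, not from the model's repository);
    assumes the answer is wrapped on both sides, as in the
    valhalla/t5-base-qg-hl input format.
    """
    if answer not in context:
        raise ValueError("answer must occur verbatim in the context")
    # Highlight only the first occurrence of the answer span.
    highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
    return f"generate question: {highlighted}"

text = highlight_answer("Die Hauptstadt von Deutschland ist Berlin.", "Berlin")
print(text)
# -> generate question: Die Hauptstadt von Deutschland ist <hl> Berlin <hl>.
```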
Architecture
The model is based on the T5 architecture and is optimized for question generation tasks. It utilizes PyTorch as the underlying framework and was fine-tuned using the GermanQUAD dataset.
Training
The model was trained with the following hyperparameters:
- Learning Rate: 0.0001
- Train Batch Size: 2
- Eval Batch Size: 2
- Seed: 100
- Gradient Accumulation Steps: 8
- Total Train Batch Size: 16
- Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- Learning Rate Scheduler Type: Linear
- Number of Epochs: 10
The model achieved a BLEU-4 score of 11.30 on the GermanQUAD test set (n=2204). The training script is available on GitHub.
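The hyperparameters above can be collected in one place to show how the total train batch size of 16 follows from the per-device batch size and gradient accumulation. This is a hedged sketch using Hugging Face Trainer-style parameter names (an assumption; the author's actual training script may differ):

```python
# Sketch of the reported hyperparameters, using Trainer-style names
# (illustrative only; not taken from the original training script).
hparams = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 100,
    "gradient_accumulation_steps": 8,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}

# The "total train batch size" of 16 is the per-device batch size times
# the gradient accumulation steps (times the device count, here 1).
effective_batch = (
    hparams["per_device_train_batch_size"]
    * hparams["gradient_accumulation_steps"]
)
print(effective_batch)  # -> 16
```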
Guide: Running Locally
To run the GERMAN-QG-T5-QUAD model locally, follow these steps:
- Install Dependencies: Ensure you have the necessary libraries installed.

  pip install transformers==4.13.0.dev0 torch==1.10.0+cu102 datasets==1.16.1 tokenizers==0.10.3
- Load the Model: Use the Transformers library to load the model.

  from transformers import T5Tokenizer, T5ForConditionalGeneration

  tokenizer = T5Tokenizer.from_pretrained("dehio/german-qg-t5-quad")
  model = T5ForConditionalGeneration.from_pretrained("dehio/german-qg-t5-quad")
- Run Inference: Prepare your input and generate questions.

  input_text = "generate question: Obwohl die Vereinigten Staaten wie auch viele Staaten des Commonwealth Erben des britischen Common Laws sind, setzt sich das amerikanische Recht bedeutend davon ab. <hl>"
  input_ids = tokenizer(input_text, return_tensors="pt").input_ids
  output = model.generate(input_ids)
  print(tokenizer.decode(output[0], skip_special_tokens=True))
For optimal performance, consider using a cloud GPU service like AWS, Google Cloud, or Azure.
License
The GERMAN-QG-T5-QUAD model is licensed under the MIT License.