GPT2 Chinese Poem
Introduction
The GPT2-Chinese-Poem model, released by UER, is designed to generate classical Chinese poetry. It is built with the UER-py and TencentPretrain frameworks, which support large-scale pre-trained models and multimodal pre-training.
Architecture
The model is based on the GPT-2 architecture, a decoder-only transformer for text generation, and is trained to produce output in the style of classical Chinese poetry.
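As a quick sanity check, the published configuration can be inspected to see the transformer's dimensions. This is a minimal sketch: the attribute names (n_layer, n_head, n_embd, n_positions) are the standard GPT-2 config fields in the transformers library, and the printed values are whatever the uer/gpt2-chinese-poem checkpoint ships with.

from transformers import GPT2Config

# Fetch the checkpoint's configuration without downloading the weights.
config = GPT2Config.from_pretrained("uer/gpt2-chinese-poem")
print(config.n_layer, config.n_head, config.n_embd)  # layers, attention heads, hidden size
print(config.n_positions)  # maximum sequence length supported by the position embeddings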
Training
The model was trained on 800,000 classical Chinese poems collected from the chinese-poetry and Poetry projects. Pre-training ran on Tencent Cloud for 200,000 steps with a sequence length of 128, using a vocabulary extended to cover frequently occurring Chinese characters. The final model was converted to Hugging Face's format for broader accessibility.
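To see the extended vocabulary and the 128-token training length reflected in practice, the released tokenizer can be inspected directly. This is a minimal sketch that prints whatever the published tokenizer reports; it is not a record of the exact training-time setup.

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-poem")
print(len(tokenizer))  # vocabulary size, including the extended Chinese characters

# Pre-training used a sequence length of 128; the same cap can be applied when encoding.
ids = tokenizer.encode("梅 山 如 积 翠", max_length=128, truncation=True)
print(ids)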
Guide: Running Locally
To run the model locally:
- Install the required libraries:
pip install transformers
- Use the provided code to generate text:
from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

# The checkpoint ships a BERT-style tokenizer covering the extended Chinese vocabulary.
tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-poem")
model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-poem")
text_generator = TextGenerationPipeline(model, tokenizer)

# Prompts begin with [CLS] and separate characters with spaces, matching the training data.
result = text_generator("[CLS]梅 山 如 积 翠 ,", max_length=50, do_sample=True)
print(result)  # a list of dicts: [{'generated_text': '...'}]
- For optimal performance, consider a cloud GPU such as those offered by AWS, Google Cloud, or Azure; a device-placement sketch follows this list.
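If a GPU is available, the pipeline can be placed on it explicitly. This is a hedged sketch using the standard transformers device convention (a CUDA device index, or -1 for CPU); the model and prompt are the same as in the example above.

import torch
from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

device = 0 if torch.cuda.is_available() else -1  # pipeline convention: GPU index, or -1 for CPU
tokenizer = BertTokenizer.from_pretrained("uer/gpt2-chinese-poem")
model = GPT2LMHeadModel.from_pretrained("uer/gpt2-chinese-poem")
text_generator = TextGenerationPipeline(model, tokenizer, device=device)
print(text_generator("[CLS]梅 山 如 积 翠 ,", max_length=50, do_sample=True))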
License
The model and its components are subject to the licensing terms detailed in their respective repositories. Users should review those terms to ensure compliance.