rugpt3large_neuro_chgk

mary905el

Introduction

RuGPT3Large_Neuro_CHGK is a Russian-language text generation model. It is built on ruGPT3Large, which uses the GPT-2 architecture, and has been fine-tuned on a dataset of questions from the quiz game CHGK (Что? Где? Когда?, "What? Where? When?").

Architecture

The model uses the GPT-2 architecture and is implemented with PyTorch and the Hugging Face Transformers library. It supports text generation and is compatible with Hugging Face Inference Endpoints.

Training

RuGPT3Large_Neuro_CHGK was fine-tuned from the pre-trained ruGPT3Large model for 5 epochs on a dataset of 75,000 CHGK questions spanning the years 2000 to 2019.

Guide: Running Locally

To run the RuGPT3Large_Neuro_CHGK model locally, follow these steps:

  1. Install Dependencies: Ensure you have Python, PyTorch, and Transformers installed.
  2. Clone the Repository: Download the model's repository from Hugging Face.
  3. Load the Model: Use the Transformers library to load the model and tokenizer.
  4. Generate Text: Set generation parameters such as max_length, do_sample, temperature, and no_repeat_ngram_size, as in the sketch following this list.
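
A minimal sketch of these steps, assuming the model is published on Hugging Face under the ID mary905el/rugpt3large_neuro_chgk (verify the exact ID on the model page); the prompt and the sampling values shown are illustrative, not the author's settings:

  # pip install torch transformers
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "mary905el/rugpt3large_neuro_chgk"  # assumed model ID; check the Hugging Face page
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id)

  # Use a GPU if one is available; the large model is slow on CPU.
  device = "cuda" if torch.cuda.is_available() else "cpu"
  model.to(device)

  # Seed generation with a question prefix (illustrative prompt: "Вопрос:" means "Question:").
  input_ids = tokenizer.encode("Вопрос:", return_tensors="pt").to(device)

  output = model.generate(
      input_ids,
      max_length=100,              # cap on the total sequence length
      do_sample=True,              # sample instead of greedy decoding
      temperature=0.9,             # soften the distribution for variety
      no_repeat_ngram_size=3,      # block repeated 3-grams
      pad_token_id=tokenizer.eos_token_id,
  )
  print(tokenizer.decode(output[0], skip_special_tokens=True))

Note that from_pretrained downloads and caches the model automatically, so cloning the repository (step 2) is only needed for offline use.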

For better performance, especially with large models, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure.

License

This model card does not explicitly mention a license; check the original repository or Hugging Face page for license details.
