rugpt3large_neuro_chgk
Maintained by mary905el

Introduction
RuGPT3Large_Neuro_CHGK is a language model for text generation in Russian. It is based on the GPT-2 architecture and has been fine-tuned on a dataset of questions from CHGK (Что? Где? Когда?, "What? Where? When?"), a Russian quiz game.
Architecture
The model is built on the GPT-2 architecture and is implemented with PyTorch and the Hugging Face Transformers library. It supports text-generation tasks and is compatible with Inference Endpoints.
Training
RuGPT3Large_Neuro_CHGK was fine-tuned from the pre-existing RuGPT3Large model for 5 epochs on a dataset of 75,000 CHGK questions written between 2000 and 2019.
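The setup described above can be sketched with the Transformers Trainer API. Only the epoch count, base model, and dataset size come from this card; the file name, sequence length, and batch size below are illustrative assumptions.

```python
# Rough sketch of the fine-tuning described above: 5 epochs of causal-LM
# training on CHGK questions, starting from the RuGPT3Large checkpoint.
BASE_MODEL = "sberbank-ai/rugpt3large_based_on_gpt2"  # pre-existing RuGPT3Large
NUM_EPOCHS = 5          # stated in this model card
NUM_QUESTIONS = 75_000  # CHGK questions from 2000-2019

def fine_tune(train_file: str = "chgk_questions.txt") -> None:
    """Fine-tune the base model on one question per line of `train_file`.

    `train_file` is an assumed layout, not taken from the model card.
    """
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    tokenizer.pad_token = tokenizer.eos_token  # ensure padding is defined
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

    # One question per line -> tokenized causal-LM examples.
    dataset = load_dataset("text", data_files=train_file)["train"]
    dataset = dataset.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
        batched=True, remove_columns=["text"],
    )

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="rugpt3large_neuro_chgk",
                               num_train_epochs=NUM_EPOCHS,
                               per_device_train_batch_size=4),
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
        train_dataset=dataset,
    )
    trainer.train()

# Running this requires a GPU with enough memory for RuGPT3Large:
# fine_tune()
```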
Guide: Running Locally
To run the RuGPT3Large_Neuro_CHGK model locally, follow these steps:
- Install Dependencies: Ensure you have Python, PyTorch, and Transformers installed.
- Clone the Repository: Download the model's repository from Hugging Face.
- Load the Model: Use the Transformers library to load the model and tokenizer.
- Generate Text: Set parameters like `max_length`, `do_sample`, `temperature`, and `no_repeat_ngram_size` to generate text.
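The steps above can be sketched in a short script. The repository id and the parameter values are assumptions for illustration; check the model's Hugging Face page for the exact id and tune the values to taste.

```python
# Minimal sketch of the guide above: load the model and tokenizer from the
# Hugging Face Hub, then generate a continuation with sampling parameters.
MODEL_ID = "mary905el/rugpt3large_neuro_chgk"  # assumed repository id

# Generation parameters named in the guide; the values are illustrative.
GENERATION_KWARGS = {
    "max_length": 100,          # cap on total tokens (prompt + continuation)
    "do_sample": True,          # sample instead of greedy decoding
    "temperature": 0.9,         # soften the next-token distribution
    "no_repeat_ngram_size": 3,  # forbid repeating any 3-gram verbatim
}

def generate(prompt: str) -> str:
    """Load the model lazily and return a sampled continuation of `prompt`."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Downloads the full model on first use, so run on a machine with enough RAM:
# print(generate("Вопрос: "))
```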
For better performance with a model of this size, consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure.
License
This model card does not explicitly mention a license; check the original repository or Hugging Face page for license details.