GPT-2 Medium
Introduction
GPT-2 Medium is the 355M-parameter version of the GPT-2 model developed by OpenAI. It is a transformer-based language model designed for English text generation, trained with a causal language modeling objective. The model is pretrained on a large English corpus, enabling it to perform a variety of language tasks without task-specific fine-tuning.
Architecture
GPT-2 Medium employs a decoder-only transformer architecture with 355 million parameters. It uses a byte-level Byte Pair Encoding (BPE) tokenizer with a vocabulary of 50,257 tokens, and input sequences are processed in windows of 1,024 consecutive tokens. The model generates text autoregressively, predicting the next token in a sequence using all previously generated tokens as context.
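The BPE idea behind the tokenizer can be illustrated with a toy merge step. This is a simplified sketch on characters rather than raw bytes, with a made-up three-word corpus; the real tokenizer learns tens of thousands of merges to build its 50,257-token vocabulary:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Hypothetical corpus: words split into symbols, with occurrence counts.
corpus = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2, ("l", "o", "g"): 3}
pair = most_frequent_pair(corpus)   # ("l", "o") occurs 10 times in total
corpus = merge_pair(corpus, pair)
print(corpus)                       # "l" and "o" now form one symbol "lo"
```

Repeating this merge step yields progressively longer subword units; operating on bytes rather than characters (as GPT-2 does) guarantees any input string can be tokenized without unknown-token fallbacks.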
Training
The model is trained on WebText, a dataset created by scraping web pages linked from Reddit posts with at least 3 karma, excluding Wikipedia. Training is self-supervised: the model learns to predict the next token in a sequence without any labeled data. This objective leads the model to learn an internal representation of the English language, and its performance is evaluated on various language benchmarks.
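The self-supervised objective needs no labels because the labels are the text itself: each position's target is simply the next token. A minimal sketch, using made-up token IDs in place of a real tokenizer:

```python
# Hypothetical token IDs standing in for a tokenized sentence.
tokens = [464, 3290, 318, 257, 922, 3290]

# Causal LM training pairs: inputs and labels are shifted copies of
# the same sequence, so position t must predict token t+1.
inputs = tokens[:-1]
labels = tokens[1:]

for ctx_len, target in zip(range(1, len(tokens)), labels):
    context = tokens[:ctx_len]
    print(f"context={context} -> predict {target}")
```

At training time the model is scored (via cross-entropy) on how much probability it assigns to each `target` given its `context`, which is what makes raw unlabeled text sufficient.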
Guide: Running Locally
To run GPT-2 Medium locally, follow these steps:
- Install the `transformers` library:

  ```
  pip install transformers
  ```
- Load the model and tokenizer for text generation in PyTorch:

  ```python
  from transformers import pipeline, set_seed

  generator = pipeline('text-generation', model='gpt2-medium')
  set_seed(42)
  result = generator("Hello, I'm a language model,", max_length=30, num_return_sequences=5)
  print(result)
  ```
- For TensorFlow, use:

  ```python
  from transformers import GPT2Tokenizer, TFGPT2Model

  tokenizer = GPT2Tokenizer.from_pretrained('gpt2-medium')
  model = TFGPT2Model.from_pretrained('gpt2-medium')
  ```
- For better performance, consider running the model on cloud GPUs (e.g., AWS, Google Cloud, or Azure) to handle its computational demands.
License
GPT-2 Medium is released under a modified MIT License. This allows for broad usage with certain restrictions noted in the license. For full license details, visit the OpenAI GitHub repository.