CodeParrot Small
Introduction
CodeParrot 🦜 is a GPT-2 model with 110 million parameters, trained to generate Python code. It is published on the Hugging Face Hub and can be loaded through the Transformers library.
Architecture
The model is based on the GPT-2 architecture, which is well-suited for text generation tasks. It has been specifically trained to understand and generate Python code.
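Beyond the parameter count, the architectural details are not listed here; if needed, the configuration of the published checkpoint can be inspected directly. A small sketch using the standard AutoConfig API (the attribute names are the usual GPT-2 config fields):

from transformers import AutoConfig

# Inspect the GPT-2 configuration shipped with the codeparrot-small checkpoint.
config = AutoConfig.from_pretrained("codeparrot/codeparrot-small")

print(config.model_type)                             # "gpt2"
print(config.n_layer, config.n_head, config.n_embd)  # depth, attention heads, hidden size
print(config.vocab_size)                             # size of the code-oriented BPE vocabulary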
Training
CodeParrot was trained on the cleaned CodeParrot dataset using the following configuration:
- Batch size: 192
- Context size: 1024
- Training steps: 150,000
- Gradient accumulation: 1
- Gradient checkpointing: False
- Learning rate: 5e-4
- Weight decay: 0.1
- Warmup steps: 2000
- Schedule: Cosine
Training ran on 16 NVIDIA A100 GPUs with 40 GB of memory each and processed roughly 29 billion tokens; the sketch below shows how this configuration maps onto standard Hugging Face training arguments.
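The original run did not necessarily use the Trainer API, so purely as an illustration, and assuming a Trainer-based setup, the hyperparameters above would correspond roughly to the following TrainingArguments (the output directory and the per-device batch split are assumptions, not values from the model card):

from transformers import TrainingArguments

# Illustrative mapping of the reported hyperparameters onto TrainingArguments.
# 12 samples per device x 16 GPUs reproduces the reported global batch size of 192.
args = TrainingArguments(
    output_dir="codeparrot-small-ckpts",  # hypothetical output directory
    per_device_train_batch_size=12,
    gradient_accumulation_steps=1,
    gradient_checkpointing=False,
    learning_rate=5e-4,
    weight_decay=0.1,
    warmup_steps=2000,
    lr_scheduler_type="cosine",
    max_steps=150_000,
)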
Guide: Running Locally
To use CodeParrot locally:
- Install Transformers: ensure the Hugging Face Transformers library and PyTorch are installed:
pip install transformers torch
- Load the Model (the forward pass below returns logits; see the decoding sketch after this list):
from transformers import AutoTokenizer, AutoModelForCausalLM

# AutoModelForCausalLM replaces the deprecated AutoModelWithLMHead for GPT-2-style models.
tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot-small")
model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot-small")

# Tokenize a prompt and run a forward pass; outputs.logits holds next-token scores.
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)
- Or Use a Pipeline:
from transformers import pipeline

# High-level API: the pipeline handles tokenization, generation, and decoding.
pipe = pipeline("text-generation", model="codeparrot/codeparrot-small")
outputs = pipe("def hello_world():")
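The forward pass in the Load-the-Model step returns raw logits rather than text. Continuing from that snippet, a minimal generation sketch (the 64-token budget and greedy decoding are illustrative choices, not settings from the model card):

# Continue the prompt with up to 64 new tokens and decode back to Python source.
# pad_token_id is set explicitly because the GPT-2 tokenizer defines no padding token.
generated = model.generate(**inputs, max_new_tokens=64, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(generated[0]))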
For efficient training and inference, cloud GPUs such as NVIDIA A100s on AWS EC2 instances are recommended.
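If a GPU is available locally or in the cloud, the pipeline can be placed on it explicitly. A minimal sketch (device index 0 assumes a single visible CUDA device; the prompt, token budget, and sampling settings are arbitrary):

import torch
from transformers import pipeline

# Use the first GPU when CUDA is available, otherwise fall back to CPU (device=-1).
device = 0 if torch.cuda.is_available() else -1
pipe = pipeline("text-generation", model="codeparrot/codeparrot-small", device=device)
outputs = pipe("def fibonacci(n):", max_new_tokens=64, do_sample=True, num_return_sequences=2)
for out in outputs:
    print(out["generated_text"])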
License
CodeParrot is released under the Apache-2.0 license.