CodeParrot Small

Introduction

CodeParrot 🦜 is a GPT-2 model with 110 million parameters trained to generate Python code. This small variant is hosted on the Hugging Face Hub as codeparrot/codeparrot-small and can be loaded with the Transformers library.

Architecture

The model uses the GPT-2 architecture, a decoder-only Transformer well suited to autoregressive text generation, and has been trained specifically to understand and generate Python code.
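
The checkpoint's layer count, attention heads, and hidden size are recorded in its configuration on the Hub. As a quick check, the sketch below (assuming the standard AutoConfig API and the codeparrot/codeparrot-small checkpoint) prints those GPT-2 hyperparameters:

    from transformers import AutoConfig

    # Fetch only the configuration; no model weights are downloaded
    config = AutoConfig.from_pretrained("codeparrot/codeparrot-small")
    print(config.model_type)                             # "gpt2"
    print(config.n_layer, config.n_head, config.n_embd)  # depth, attention heads, hidden size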

Training

CodeParrot was trained on the cleaned CodeParrot dataset using the following configuration:

  • Batch size: 192
  • Context size: 1024
  • Training steps: 150,000
  • Gradient accumulation: 1
  • Gradient checkpointing: False
  • Learning rate: 5e-4
  • Weight decay: 0.1
  • Warmup steps: 2000
  • Schedule: Cosine

The training process was conducted on 16 NVIDIA A100 GPUs, each with 40GB of memory, processing roughly 29 billion tokens.
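
The quoted token count follows directly from the hyperparameters above (batch size × context size × training steps); a minimal arithmetic check:

    # Rough total tokens processed = batch size * context size * training steps
    batch_size, context_size, steps = 192, 1024, 150_000
    total_tokens = batch_size * context_size * steps
    print(f"{total_tokens / 1e9:.1f}B tokens")  # ~29.5B, consistent with the ~29B figure above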

Guide: Running Locally

To use CodeParrot locally:

  1. Install Transformers: Ensure the Hugging Face Transformers library and a PyTorch backend are installed:

    pip install transformers torch
    
  2. Load the Model and Generate:

    from transformers import AutoTokenizer, AutoModelForCausalLM

    # AutoModelWithLMHead is deprecated; AutoModelForCausalLM is the
    # current class for causal (left-to-right) language models
    tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot-small")
    model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot-small")

    # Tokenize a prompt and generate up to 32 new tokens
    inputs = tokenizer("def hello_world():", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
  3. Or Use a Pipeline:

    from transformers import pipeline
    
    pipe = pipeline("text-generation", model="codeparrot/codeparrot-small")
    outputs = pipe("def hello_world():")
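
The pipeline forwards standard generation keyword arguments (such as max_new_tokens, do_sample, temperature, and num_return_sequences) to the underlying generate call. A short sketch with an arbitrary prompt:

    # Sample three completions of up to 64 new tokens each
    outputs = pipe(
        "def fibonacci(n):",
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
        num_return_sequences=3,
    )
    for out in outputs:
        print(out["generated_text"])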
    

For efficient training and inference, cloud GPUs such as NVIDIA A100 instances (for example, on AWS EC2) are recommended.

License

CodeParrot is released under the Apache-2.0 license.
