fine-tuned-codegen-2B-Verilog

shailja

Introduction

VeriGen is a fine-tuned version of the CodeGen-multi-2B model, specifically tailored for Verilog, a hardware description language. It is designed to assist with Verilog code generation, supports a context length of 2048 tokens, and was trained on a specialized Verilog dataset.

Architecture

The model uses the GPT-2 architecture with multi-query attention. It was pretrained for 150k steps on approximately 72 billion tokens, using 16-bit floating-point (fp16) precision.
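As a quick sanity check, the published configuration can be inspected before downloading the full weights. This is a minimal sketch assuming the standard CodeGen configuration fields in transformers; the printed values should match the figures above:

    from transformers import AutoConfig

    # Standard CodeGen configuration fields (assumed; verify against the model card)
    config = AutoConfig.from_pretrained("shailja/fine-tuned-codegen-2B-Verilog")
    print(config.model_type)    # expected: "codegen"
    print(config.n_positions)   # context length, expected: 2048
    print(config.n_layer, config.n_head, config.n_embd)  # depth, heads, hidden size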

Training

Fine-tuning was carried out on 3 Tesla A100 GPUs over 8 days, using Verilog code sourced from GitHub repositories and textbooks.

Guide: Running Locally

To run the model locally:

  1. Install Dependencies:
    Ensure you have Python installed and run pip install transformers torch.

  2. Load Model and Tokenizer:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Load the tokenizer and model, then move the model to the GPU
    model_name = "shailja/fine-tuned-codegen-2B-Verilog"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).to('cuda')
    
  3. Generate Code:
    Use the following code snippet to generate Verilog code:

    # Prompt with a Verilog comment; decoding is sampled, so outputs vary run to run
    prompt = "//module half adder "
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to('cuda')
    sample = model.generate(input_ids, max_length=128, temperature=0.5, top_p=0.9, do_sample=True)
    # truncate_before_pattern (supported by the CodeGen tokenizer) cuts the output
    # at the first "endmodule", which is then re-appended to close the module
    print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
    
  4. Cloud GPUs:
    For best performance, use a GPU instance from a cloud provider such as AWS, GCP, or Azure; for machines without a dedicated GPU, see the device-agnostic sketch below.
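
If a dedicated GPU is not available, the same pipeline can select its device at runtime. The following is a minimal sketch, assuming the standard transformers and torch APIs used above; fp16 is used on GPU to reduce memory, with fp32 as the CPU fallback:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    # Select the device at runtime so the same script runs on CPU or GPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # fp16 halves memory use on GPU; CPUs generally need fp32
    dtype = torch.float16 if device == "cuda" else torch.float32

    model_name = "shailja/fine-tuned-codegen-2B-Verilog"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=dtype).to(device)

    input_ids = tokenizer("//module half adder ", return_tensors="pt").input_ids.to(device)
    sample = model.generate(input_ids, max_length=128, temperature=0.5, top_p=0.9, do_sample=True)
    print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")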

License

The VeriGen model is distributed under the BigCode OpenRAIL-M v1 license, which includes specific use restrictions and sharing requirements. Full details are available in the license text accompanying the model.
