Fine-Tuned CodeGen 2B Verilog
Introduction
VeriGen is a fine-tuned version of the CodeGen-multi-2B model, tailored to Verilog, a hardware description language. It assists with Verilog code generation, supports a context length of 2048 tokens, and was trained on a specialized corpus of Verilog code.
Architecture
The model employs the GPT-2 architecture with multi-query attention. It underwent 150k pretraining steps on approximately 72 billion tokens, in 16-bit floating-point precision.
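Multi-query attention differs from standard multi-head attention in that all query heads share a single key/value head, which shrinks the key/value cache during autoregressive decoding. Below is a minimal illustrative sketch of the idea; the projection names and shapes are assumptions for exposition, not VeriGen's actual implementation:

```python
import torch
import torch.nn.functional as F

def multi_query_attention(x, w_q, w_k, w_v, num_heads):
    """x: (batch, seq, d_model); queries get num_heads projections,
    while keys and values share a single head."""
    batch, seq, d_model = x.shape
    head_dim = d_model // num_heads

    q = (x @ w_q).view(batch, seq, num_heads, head_dim).transpose(1, 2)  # (b, h, s, d)
    k = (x @ w_k).view(batch, seq, 1, head_dim).transpose(1, 2)          # (b, 1, s, d)
    v = (x @ w_v).view(batch, seq, 1, head_dim).transpose(1, 2)          # (b, 1, s, d)

    # The single K/V head broadcasts across all h query heads; this is
    # the memory saving that makes MQA attractive for decoding.
    scores = (q @ k.transpose(-2, -1)) / head_dim ** 0.5                 # (b, h, s, s)
    out = F.softmax(scores, dim=-1) @ v                                  # (b, h, s, d)
    return out.transpose(1, 2).reshape(batch, seq, d_model)

# Example: 4 query heads sharing one key/value head
d_model, num_heads = 64, 4
x = torch.randn(2, 10, d_model)
w_q = torch.randn(d_model, d_model)
w_k = torch.randn(d_model, d_model // num_heads)
w_v = torch.randn(d_model, d_model // num_heads)
y = multi_query_attention(x, w_q, w_k, w_v, num_heads)  # -> (2, 10, 64)
```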
Training
Training ran on 3 Tesla A100 GPUs over 8 days. The model was fine-tuned on Verilog code sourced from GitHub repositories and textbooks.
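The card does not publish the training script or hyperparameters, so the following is only a minimal sketch of causal-LM fine-tuning under assumed settings: the corpus file `verilog_corpus.txt`, the learning rate, and the single-epoch loop are all hypothetical, and a real 2B-parameter run would additionally need mixed precision and gradient accumulation to fit the hardware described above.

```python
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM

class VerilogDataset(Dataset):
    """Chunk a plain-text Verilog corpus into fixed-length token blocks."""
    def __init__(self, path, tokenizer, block_size=2048):
        ids = tokenizer(open(path).read()).input_ids
        self.blocks = [ids[i:i + block_size]
                       for i in range(0, len(ids) - block_size + 1, block_size)]

    def __len__(self):
        return len(self.blocks)

    def __getitem__(self, i):
        block = torch.tensor(self.blocks[i])
        return {"input_ids": block, "labels": block}  # causal LM: labels = inputs

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-2B-multi")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-2B-multi").to("cuda")
loader = DataLoader(VerilogDataset("verilog_corpus.txt", tokenizer),
                    batch_size=1, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

model.train()
for batch in loader:
    batch = {k: v.to("cuda") for k, v in batch.items()}
    loss = model(**batch).loss  # shifted next-token cross-entropy, computed internally
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```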
Guide: Running Locally
To run the model locally:
- Install Dependencies: Ensure you have Python installed, then run:

  ```bash
  pip install transformers torch
  ```

- Load Model and Tokenizer:

  ```python
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

  model_name = "shailja/CodeGen_2B_Verilog"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForCausalLM.from_pretrained(model_name).to("cuda")
  ```

- Generate Code: Use the following snippet to generate Verilog code:

  ```python
  prompt = "//module half adder "
  input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
  # do_sample=True is required for temperature and top_p to take effect
  sample = model.generate(input_ids, max_length=128, do_sample=True,
                          temperature=0.5, top_p=0.9)
  # truncate_before_pattern (a CodeGen tokenizer feature) cuts the decoded
  # text before the first "endmodule"; the keyword is re-appended so the
  # emitted module is syntactically complete
  print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
  ```

- Cloud GPUs: For optimal performance, consider cloud services such as AWS, GCP, or Azure that provide GPU instances. A combined, device-agnostic version of the steps above is sketched after this list.
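The sketch below combines the steps above into one script. The prompt and generation settings mirror the snippets in this guide; the CPU fallback is an added assumption for machines without a GPU (expect very slow generation for a 2B-parameter model on CPU):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Fall back to CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "shailja/CodeGen_2B_Verilog"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

prompt = "//module half adder "
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
sample = model.generate(input_ids, max_length=128, do_sample=True,
                        temperature=0.5, top_p=0.9)

# Truncate the decoded text before the first "endmodule" and re-append
# the keyword so the generated module is complete
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
```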
License
The VeriGen model is distributed under the BigCode OpenRAIL-M v1 license, which includes specific use restrictions and sharing requirements. Full details are available in the license text.