shailja/fine-tuned-codegen-6B-Verilog
Introduction
The VeriGen model is a fine-tuned version of the CodeGen-multi-6B model, trained specifically on Verilog code datasets. It is designed to assist with Verilog code generation by completing supplied prompts. The model does not execute commands directly; it works best when guided with partial code inputs such as a module header or a descriptive comment.
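For illustration, a typical prompt is a Verilog comment or module header, which the model then continues with an implementation. The completion below is a hand-written sketch of the target style, not actual model output:

//module half adder
module half_adder(input a, input b, output sum, output carry);
  assign sum = a ^ b;    // sum bit: XOR of the inputs
  assign carry = a & b;  // carry bit: AND of the inputs
endmodule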
Architecture
- Model Architecture: GPT-2 with multi-query attention
- Pretraining Steps: 150,000
- Pretraining Tokens: ~72 billion
- Precision: FP16
Training
- Datasets: Verilog code collected from GitHub repositories and Verilog textbooks.
- Hardware: 4 Tesla A100 GPUs.
- Training Duration: 10 days.
Guide: Running Locally
- Install Dependencies:
pip install -q transformers torch
- Import Libraries:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
- Set Up Prompt:
prompt = "//module half adder "
device = 'cuda'
- Load Model and Tokenizer:
model_name = "shailja/fine-tuned-codegen-6B-Verilog"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
- Generate Sample Code:
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# temperature and top_p only take effect when sampling is enabled via do_sample=True
sample = model.generate(input_ids, max_length=128, temperature=0.5, top_p=0.9, do_sample=True)
# cut the output at the first "endmodule" and re-append the keyword
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
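Because decoding is stochastic, it can help to draw several candidates for one prompt and keep the best. The sketch below reuses the variables defined in the steps above; num_return_sequences is a standard transformers generate option:

# draw three independent samples for the same prompt
samples = model.generate(
    input_ids,
    max_length=128,
    temperature=0.5,
    top_p=0.9,
    do_sample=True,
    num_return_sequences=3,
)
for i, s in enumerate(samples):
    print(f"--- candidate {i} ---")
    print(tokenizer.decode(s, truncate_before_pattern=[r"endmodule"]) + "endmodule")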
Suggestion
For efficient processing, especially when working with large models, consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure.
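If a single GPU is short on memory, one option is to load the checkpoint in half precision, matching the FP16 precision listed under Architecture. This is a minimal sketch using standard from_pretrained options (torch_dtype and device_map); device_map="auto" additionally requires the accelerate package:

import torch
from transformers import AutoModelForCausalLM

# load weights in FP16 and let accelerate place layers across available devices
model = AutoModelForCausalLM.from_pretrained(
    "shailja/fine-tuned-codegen-6B-Verilog",
    torch_dtype=torch.float16,
    device_map="auto",  # requires: pip install accelerate
)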
License
The model is distributed under the BigCode OpenRAIL-M v1 license. Users must read and accept the license agreement before using the model, including compliance with its use restrictions and sharing requirements. The full agreement is available on the model's Hugging Face page.