shailja/fine-tuned-codegen-6B-Verilog
Introduction
The VeriGen model is a fine-tuned version of the CodeGen-multi-6B model, trained specifically on Verilog code datasets. It is designed to assist with Verilog code generation by completing supplied prompts. The model does not execute commands directly; it works best when guided with partial code inputs such as a module header or a descriptive comment.
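For illustration, a typical prompt is a Verilog comment or module header, which the model then continues with an implementation. The completion below is a hand-written sketch of the target style, not actual model output:

//module half adder
module half_adder(input a, input b, output sum, output carry);
  assign sum = a ^ b;    // sum bit: XOR of the inputs
  assign carry = a & b;  // carry bit: AND of the inputs
endmodule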
Architecture
- Model Architecture: GPT-2 with multi-query attention
- Pretraining Steps: 150,000
- Pretraining Tokens: ~72 billion
- Precision: FP16
Training
- Datasets: Verilog code collected from GitHub repositories and Verilog textbooks.
- Hardware: 4 Tesla A100 GPUs.
- Training Duration: 10 days.
Guide: Running Locally
- Install Dependencies:
pip install -q transformers torch
- Import Libraries:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
- Set Up Prompt:
prompt = "//module half adder "
device = 'cuda'
- Load Model and Tokenizer:
model_name = "shailja/fine-tuned-codegen-6B-Verilog"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
- Generate Sample Code:
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
# temperature and top_p only take effect when sampling is enabled via do_sample=True
sample = model.generate(input_ids, max_length=128, temperature=0.5, top_p=0.9, do_sample=True)
# cut the output at the first "endmodule" and re-append the keyword
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"endmodule"]) + "endmodule")
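Because decoding is stochastic, it can help to draw several candidates for one prompt and keep the best. The sketch below reuses the variables defined in the steps above; num_return_sequences is a standard transformers generate option:

# draw three independent samples for the same prompt
samples = model.generate(
    input_ids,
    max_length=128,
    temperature=0.5,
    top_p=0.9,
    do_sample=True,
    num_return_sequences=3,
)
for i, s in enumerate(samples):
    print(f"--- candidate {i} ---")
    print(tokenizer.decode(s, truncate_before_pattern=[r"endmodule"]) + "endmodule")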
Suggestion
For efficient processing, especially when working with large models, consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure.
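If a single GPU is short on memory, one option is to load the checkpoint in half precision, matching the FP16 precision listed under Architecture. This is a minimal sketch using standard from_pretrained options (torch_dtype and device_map); device_map="auto" additionally requires the accelerate package:

import torch
from transformers import AutoModelForCausalLM

# load weights in FP16 and let accelerate place layers across available devices
model = AutoModelForCausalLM.from_pretrained(
    "shailja/fine-tuned-codegen-6B-Verilog",
    torch_dtype=torch.float16,
    device_map="auto",  # requires: pip install accelerate
)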
License
The model is distributed under the BigCode OpenRAIL-M v1 license. Users must read and accept the license agreement before using the model, including compliance with its use restrictions and sharing requirements. The full agreement is available on the model's Hugging Face page.