Code Llama 34b Instruct hf
codellamaIntroduction
Code Llama is a suite of pretrained and fine-tuned generative text models, ranging from 7 billion to 34 billion parameters, designed for code synthesis and understanding. The 34B instruct-tuned version is available in the Hugging Face Transformers format. This model is particularly adept at code completion, infilling, and instruction following, with a specialization in Python.
Architecture
Code Llama employs an auto-regressive language model using an optimized transformer architecture. It is designed in three variants:
- Base Model: General purpose for code synthesis and understanding.
- Python Model: Specifically optimized for Python programming.
- Instruct Model: Tailored for instruction following and safer deployment.
The model is available in sizes of 7B, 13B, and 34B parameters, with training conducted between January and July 2023.
Training
Training was executed using Meta’s Research Super Cluster, with a total of 400K GPU hours on A100-80GB hardware, resulting in estimated emissions of 65.3 tCO2eq, fully offset by Meta’s sustainability program. The models were trained and fine-tuned on the same data as Llama 2, with specific adjustments on weights.
Guide: Running Locally
-
Installation
Ensure you have the necessary packages installed:pip install transformers accelerate
-
Model Usage
The model supports code completion and infilling tasks, with a focus on Python. It can be integrated into applications for code synthesis or as a code assistant. -
Hardware Recommendations
For optimal performance, especially with larger models like the 34B version, consider using cloud-based GPUs, such as NVIDIA A100 or V100, to accommodate the computation requirements.
License
Code Llama is released under a custom commercial license by Meta. Detailed licensing information is available at Meta's Llama Downloads. The use of this model is subject to Meta's Acceptable Use Policy and Licensing Agreement.