InCoder 6B
Introduction
INCODER 6B is a 6-billion-parameter decoder-only Transformer designed for code generation and infilling. It is trained with a causal-masked objective, which supports both standard left-to-right generation and insertion of code at arbitrary positions. The model was trained on code from open-source repositories with permissive licenses, primarily Python and JavaScript, with 28 programming languages represented in total.
Architecture
INCODER 6B is a decoder-only Transformer trained with a causal-masked objective. It is published in two versions: a full-precision (float32) checkpoint intended for fine-tuning and a half-precision (float16) checkpoint intended for inference, catering to different memory budgets and use cases.
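The causal-masked objective lets the model fill in a region of code given both its left and right context. A minimal sketch of constructing an infill prompt; the sentinel spellings `<|mask:0|>` and `<|endofmask|>` are taken from the InCoder paper and repository, not from this card, so treat them as an assumption:

```python
def build_infill_prompt(prefix: str, suffix: str) -> str:
    """Construct a causal-masked infill prompt.

    The region to be filled is replaced by the <|mask:0|> sentinel, and
    the same sentinel is appended after the suffix, signalling the model
    to generate the masked span next. Generation for the span is then
    typically stopped at the <|endofmask|> token.
    (Sentinel spellings follow the InCoder repository; an assumption.)
    """
    return f"{prefix}<|mask:0|>{suffix}<|mask:0|>"

prompt = build_infill_prompt(
    "def count_words(path):\n    ",
    "\n    return counts\n",
)
print(prompt)
```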
Training
The model was trained on public code from platforms such as GitHub and GitLab, as well as StackOverflow content, under permissive licenses including Apache 2.0, MIT, BSD-2, and BSD-3. Training was performed with Fairseq.
Guide: Running Locally
Requirements
- PyTorch
- Tokenizers (version 0.12.1 or higher)
- Transformers
To install the necessary dependencies, run:
pip install torch
pip install "tokenizers>=0.12.1"
pip install transformers
Model Usage
Full-precision (float32) Version:
- Suitable for fine-tuning.
- Requires substantial GPU memory, possibly multiple GPUs.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/incoder-6B")
Half-precision (float16) Version:
- Optimized for inference with reduced memory usage.
- Can run on a 16 GB GPU with batch size 1 for sequence lengths of at least 256.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "facebook/incoder-6B",
    revision="float16",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)
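A rough back-of-the-envelope estimate explains the two versions: each parameter occupies 4 bytes in float32 but only 2 bytes in float16, so for a nominal 6 B parameters the weights alone need roughly 22 GiB versus 11 GiB (activations, the KV cache, and optimizer state add more, which is why fine-tuning in float32 may need multiple GPUs):

```python
def weight_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate GiB needed just to hold the model weights."""
    return n_params * bytes_per_param / 2**30

n = 6e9  # nominal parameter count; an approximation for illustration
print(f"float32: {weight_gib(n, 4):.1f} GiB")  # does not fit a 16 GB GPU
print(f"float16: {weight_gib(n, 2):.1f} GiB")  # fits a 16 GB GPU
```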
Tokenizer
To load the tokenizer and decode outputs:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/incoder-6B")
# Pass clean_up_tokenization_spaces=False when decoding, otherwise
# whitespace in the output (significant in Python) may be altered.
output = tokenizer.decode(tokenizer.encode("from ."), clean_up_tokenization_spaces=False)
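When the model is used for infilling, generation for a masked span is typically truncated at an end-of-mask sentinel before the text is spliced back into the surrounding code. A small post-processing helper, assuming the `<|endofmask|>` spelling used by the InCoder repository (not stated in this card):

```python
def extract_infill(generated: str, eom: str = "<|endofmask|>") -> str:
    """Return the text generated for a masked span, cut off at the
    first end-of-mask sentinel if the model produced one."""
    idx = generated.find(eom)
    return generated if idx == -1 else generated[:idx]

print(extract_infill("counts = {}<|endofmask|>trailing tokens"))
```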
Cloud GPUs
For optimal performance and to handle large models, consider using cloud-based GPU services such as AWS EC2, Google Cloud Platform, or Azure.
License
The model is released under the CC-BY-NC 4.0 license, allowing for non-commercial use with attribution.