CodeQwen1.5-7B

Qwen

Introduction

CodeQwen1.5 is the code-specific version of the Qwen1.5 language model. It is a transformer-based, decoder-only model pretrained on a large volume of code data. The model offers strong code generation capabilities and competitive performance across a range of benchmarks. It supports long-context understanding with a context length of up to 64K tokens, covers 92 programming languages, and performs well on tasks such as text-to-SQL and bug fixing.

Architecture

The CodeQwen1.5 model is derived from the Qwen1.5 series, a family of decoder-only language models of varying sizes. It is trained on 3 trillion tokens of code data and uses grouped-query attention (GQA) for efficient inference.
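
As a rough sanity check, the GQA setup can be read from the published model configuration. The sketch below assumes the Hugging Face Hub model id Qwen/CodeQwen1.5-7B and the standard Qwen2 config attributes in Transformers:

    from transformers import AutoConfig

    # Load only the published configuration (no model weights are downloaded).
    cfg = AutoConfig.from_pretrained("Qwen/CodeQwen1.5-7B")

    # With grouped-query attention, the number of key/value heads is smaller
    # than the number of query heads, so several query heads share each KV head.
    print(cfg.num_attention_heads, cfg.num_key_value_heads)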

Training

CodeQwen1.5 relies on recent features of the Hugging Face Transformers library. Use transformers version 4.37.0 or later; earlier versions do not recognize the architecture and fail with KeyError: 'qwen2' when loading the model.
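
A minimal check of the installed version, assuming a standard pip environment (packaging ships as a dependency of transformers):

    import transformers
    from packaging import version

    # Versions before 4.37.0 do not know the "qwen2" model type and raise
    # KeyError: 'qwen2' when loading CodeQwen1.5.
    assert version.parse(transformers.__version__) >= version.parse("4.37.0")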

Guide: Running Locally

  1. Install Dependencies: Ensure you have transformers>=4.37.0 installed.
  2. Model Usage: The base model is not intended for chat, but it can be used for fine-tuning and for tasks like code infilling and generation; see the generation sketch after this list. Because it is a completion model, pay attention to stopping criteria so that output does not run on indefinitely.
  3. Hardware Recommendations: Utilizing cloud GPUs, such as those provided by AWS, GCP, or Azure, is recommended for efficient model execution and training.
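
The following is a minimal generation sketch, assuming the Hub model id Qwen/CodeQwen1.5-7B and that torch and accelerate are installed (device_map="auto" requires accelerate):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Qwen/CodeQwen1.5-7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the checkpoint's native precision
        device_map="auto",    # place layers on available devices (needs accelerate)
    )

    # The base model is a completion model, not a chat model: give it a
    # code prefix and let it continue.
    prompt = "def quicksort(arr):"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Cap max_new_tokens because base completion models can run past a
    # natural stopping point; custom stopping criteria may also help.
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))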

License

The CodeQwen1.5 model is released under the tongyi-qianwen-research license. For more details, refer to the license file.
