Qwen2.5 Coder 32 B Instruct 128 K G G U F

unsloth

Introduction

Qwen2.5-Coder is a series of large language models specifically designed for code-related tasks. It features six different model sizes (ranging from 0.5 to 32 billion parameters) to cater to various developer needs. The model excels in code generation, reasoning, and fixing, utilizing 5.5 trillion training tokens, including source code and synthetic data. It aims to provide a robust foundation for real-world applications like Code Agents, while also maintaining strong competencies in mathematics and general tasks.

Architecture

The Qwen2.5-Coder-32B model is based on a transformer architecture with several enhancements:

  • Type: Causal Language Models
  • Training Stage: Pretraining & Post-training
  • Architecture: Incorporates RoPE, SwiGLU, RMSNorm, and Attention QKV bias
  • Parameters: 32.5 billion total (31.0 billion non-embedding)
  • Layers: 64
  • Attention Heads (GQA): 40 for Q and 8 for KV
  • Context Length: 131,072 tokens

Training

The Qwen2.5-Coder models are trained using a vast dataset of 5.5 trillion tokens, including a mix of source code and synthetic data. This extensive training is aimed at enhancing the model's performance in coding tasks, making it comparable to state-of-the-art models like GPT-4o.

Guide: Running Locally

To run the Qwen2.5-Coder-32B model locally, ensure you have the latest version of the Hugging Face Transformers library. Versions older than 4.37.0 may result in errors. For optimal performance, a powerful GPU is recommended. Here are the basic steps:

  1. Install Dependencies: Update to the latest Transformers library.
  2. Download the Model: Obtain the model from the Hugging Face repository.
  3. Set Up the Environment: Configure your system to handle the model's requirements.
  4. Run the Model: Use the provided Colab notebooks for a quick start.

For those without local GPU capabilities, consider using cloud GPU services like Google Colab or Kaggle, which offer free access to GPUs like the Tesla T4.

License

The Qwen2.5-Coder-32B model is released under the Apache-2.0 license, allowing for broad usage with attribution.

More Related APIs