Qwen2.5-Coder-14B-Instruct-GGUF
Introduction
Qwen2.5-Coder is part of the code-specific Qwen family of large language models, designed for code generation, code reasoning, and code fixing. It improves upon previous versions with 5.5 trillion training tokens and supports real-world applications such as code agents. Notable features include long-context support of up to 128K tokens and availability in a range of model sizes.
Architecture
The Qwen2.5-Coder 14B model is a causal language model built with transformers, RoPE, SwiGLU, RMSNorm, and attention QKV bias. It has 14.7 billion parameters, of which 13.1 billion are non-embedding. The model consists of 48 layers and uses grouped-query attention with 40 query heads and 8 key/value heads. It supports a full context length of 32,768 tokens and is offered in multiple quantizations, from q2_K up to q8_0.
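The grouped-query attention figures above determine KV-cache memory use at long context. A rough per-token estimate can be sketched as follows; the head dimension of 128 is an assumption (the published 5120 hidden size divided by 40 query heads), not stated in this card:

```python
# Rough per-token KV-cache size for Qwen2.5-Coder-14B with an fp16 cache.
# head_dim = 128 is an assumption (hidden size 5120 / 40 query heads).
layers = 48
kv_heads = 8          # grouped-query attention: only 8 KV heads are cached
head_dim = 128        # assumed
bytes_per_elem = 2    # fp16

# K and V each store (kv_heads * head_dim) values per layer per token.
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
print(kv_bytes_per_token)                   # bytes per token
print(kv_bytes_per_token * 32768 / 2**30)   # GiB for the full 32,768-token context
```

Under these assumptions the cache costs about 192 KiB per token, or roughly 6 GiB at the full 32K context, which is why the 8-head KV design matters for local inference.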
Training
Qwen2.5-Coder was trained using a massive dataset that includes source code, text-code grounding, and synthetic data, among others. The training process involves both pretraining and post-training stages to enhance its code-related capabilities and general competencies.
Guide: Running Locally
- Installation
  Install the required package:
  pip install -U huggingface_hub
- Download Files
  Download the necessary GGUF files:
  huggingface-cli download Qwen/Qwen2.5-Coder-14B-Instruct-GGUF --include "qwen2.5-coder-14b-instruct-q5_k_m*.gguf" --local-dir . --local-dir-use-symlinks False
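The `--include` flag filters repository files with a shell-style glob, so only the q5_k_m quantization is downloaded. A quick sanity check of what the pattern matches (the split-shard filename is the one merged in the next step; the q8_0 filename is assumed for illustration):

```python
from fnmatch import fnmatch

# The glob passed to --include above.
pattern = "qwen2.5-coder-14b-instruct-q5_k_m*.gguf"

# Split shards of the q5_k_m quantization match...
print(fnmatch("qwen2.5-coder-14b-instruct-q5_k_m-00001-of-00002.gguf", pattern))  # True
print(fnmatch("qwen2.5-coder-14b-instruct-q5_k_m-00002-of-00002.gguf", pattern))  # True
# ...while other quantizations do not.
print(fnmatch("qwen2.5-coder-14b-instruct-q8_0.gguf", pattern))                   # False
```

Swapping the quantization name in the glob is all that is needed to fetch a different variant.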
- Merge Files
  If the download is split into shards, merge them with:
  ./llama-gguf-split --merge qwen2.5-coder-14b-instruct-q5_k_m-00001-of-00002.gguf qwen2.5-coder-14b-instruct-q5_k_m.gguf
- Run Model
  Start the model in conversation mode:
  ./llama-cli -m <gguf-file-path> \
    -co -cnv -p "You are Qwen, created by Alibaba Cloud. You are a helpful assistant." \
    -fa -ngl 80 -n 512
For better performance, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure.
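With `-cnv`, llama.cpp applies the chat template embedded in the GGUF; Qwen2.5 models follow the ChatML convention. A minimal sketch of the prompt string that conversation mode constructs from the system prompt above (the exact formatting is assumed from the ChatML convention, not taken from this card):

```python
# Sketch of the ChatML prompt format used by Qwen2.5 models (assumed layout;
# llama.cpp builds this automatically when run with -cnv).
def build_chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are Qwen, created by Alibaba Cloud. You are a helpful assistant.",
    "Write a hello-world program in C.",
)
print(prompt)
```

The model then generates until it emits the `<|im_end|>` stop token, which is why supplying the system prompt via `-p` is enough to steer the whole conversation.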
License
The Qwen2.5-Coder model is released under the Apache 2.0 license; see the LICENSE file in the model repository for details.