Qwen2.5-0.5B

Qwen

Introduction

Qwen2.5-0.5B is part of the Qwen2.5 series of large language models, designed with improvements in areas such as coding, mathematics, instruction following, and multilingual capabilities. It supports long contexts of up to 32,768 tokens, can generate up to 8K tokens, and covers more than 29 languages. This release provides the base (pretrained) model with 0.5 billion parameters.

Architecture

  • Type: Causal Language Models
  • Training Stage: Pretraining
  • Architecture: Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings
  • Number of Parameters: 0.49B
  • Number of Parameters (Non-Embedding): 0.36B
  • Number of Layers: 24
  • Number of Attention Heads (GQA): 14 for Q and 2 for KV
  • Context Length: Full 32,768 tokens
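
The listed architecture details can be checked directly from the model's configuration. The following is a minimal sketch, assuming the `transformers` library (>= 4.37.0) and access to the `Qwen/Qwen2.5-0.5B` repository on the Hugging Face Hub; the printed values shown in the comments are taken from the specification above.

```python
from transformers import AutoConfig

# Load the configuration for Qwen2.5-0.5B (requires transformers >= 4.37.0).
config = AutoConfig.from_pretrained("Qwen/Qwen2.5-0.5B")

print(config.model_type)               # "qwen2"
print(config.num_hidden_layers)        # 24 layers
print(config.num_attention_heads)      # 14 query heads
print(config.num_key_value_heads)      # 2 key/value heads (GQA)
print(config.max_position_embeddings)  # 32768-token context
print(config.tie_word_embeddings)      # True: tied word embeddings
```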

Training

The model has undergone pretraining only and is not recommended for conversational use without further post-training, such as SFT or RLHF. Support for Qwen2.5 is included in recent versions of Hugging Face's Transformers library, so using an up-to-date version is advised.
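
As an illustration of such post-training, here is a minimal supervised fine-tuning sketch using the TRL library's SFTTrainer. The dataset name (trl-lib/Capybara) and output directory are placeholders rather than part of the original card, and the exact configuration options depend on your TRL version.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder instruction-following dataset; substitute your own data.
dataset = load_dataset("trl-lib/Capybara", split="train")

# Minimal training configuration; tune hyperparameters for real runs.
training_args = SFTConfig(output_dir="Qwen2.5-0.5B-SFT")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # base model pulled from the Hugging Face Hub
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```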

Guide: Running Locally

  1. Requirements: Ensure the installed Hugging Face Transformers library is at least version 4.37.0; older versions raise KeyError: 'qwen2'.
  2. Environment Setup: Install necessary dependencies using Python and Pip.
  3. Model Download: Use the Hugging Face Model Hub to download the Qwen2.5-0.5B model.
  4. Execution: Load the model and tokenizer with the Transformers library and run inference, as shown in the sketch after this list.
  5. Hardware Recommendation: Use cloud GPUs such as NVIDIA V100 or A100 for efficient performance.
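
The following is a minimal inference sketch, assuming PyTorch and transformers >= 4.37.0 are installed; the prompt text is a placeholder, and generation settings are illustrative rather than recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B"

# Load the tokenizer and model; device_map="auto" places weights on an available GPU.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# As a base (pretrained) model, Qwen2.5-0.5B continues raw text rather than following chat turns.
prompt = "The key advantages of small language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```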

License

The Qwen2.5-0.5B model is licensed under the Apache 2.0 License. For more details, refer to the license file.
