Qwen2-0.5B-Instruct
Introduction
Qwen2 is a new series of large language models, featuring both base and instruction-tuned models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repository contains the instruction-tuned 0.5B Qwen2 model. It surpasses many open-source models and competes with proprietary models across various benchmarks for language understanding, generation, multilingual capability, coding, mathematics, and reasoning.
Architecture
The Qwen2 series includes decoder-only language models of different sizes, comprising both base and aligned chat models. It is built on the Transformer architecture with SwiGLU activation, attention QKV bias, and group query attention. The tokenizer has also been improved to adapt well to multiple natural languages and to code.
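As a rough illustration of the SwiGLU feed-forward block mentioned above, here is a minimal PyTorch sketch; the class name and layer sizes are illustrative assumptions, not the model's actual implementation:

    import torch.nn as nn
    import torch.nn.functional as F

    class SwiGLUMLP(nn.Module):
        # Hypothetical SwiGLU feed-forward block; sizes are placeholders, not Qwen2's config.
        def __init__(self, hidden_size=1024, intermediate_size=2816):
            super().__init__()
            self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
            self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
            self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

        def forward(self, x):
            # SwiGLU: SiLU-gated linear unit, followed by a projection back to hidden_size
            return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))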
Training
The models were pretrained on extensive data and then post-trained with supervised fine-tuning and direct preference optimization.
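For context, direct preference optimization trains on pairs of preferred and rejected responses without a separate reward model. The following is a minimal sketch of the generic DPO loss, with an assumed beta value; it is not Qwen2's actual training code:

    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        # Standard DPO objective: increase the policy's preference margin for the
        # chosen response relative to a frozen reference model.
        chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
        rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
        return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()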
Guide: Running Locally
- Install the required version of the transformers library: pip install "transformers>=4.37.0" (quoted so the shell does not treat >= as a redirect).
- Load the model and tokenizer in Python and run a single chat turn (a multi-turn follow-up sketch appears after this list):
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda"  # use a GPU if available

    model = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2-0.5B-Instruct",
        torch_dtype="auto",
        device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

    prompt = "Give me a short introduction to large language model."
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=512
    )
    # Drop the prompt tokens so only the newly generated reply is decoded
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)
- For optimal performance, it is recommended to use cloud GPUs, such as those provided by AWS, Google Cloud, or Azure.
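Building on the loading snippet above, a multi-turn conversation can be continued by appending the generated reply and a new user message before re-applying the chat template. This sketch reuses the variables from the previous code block, and the follow-up question is only an example:

    # Continue the chat: append the assistant reply plus a new user turn
    messages.append({"role": "assistant", "content": response})
    messages.append({"role": "user", "content": "Can you make it shorter?"})

    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to(device)
    generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
    generated_ids = [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    follow_up = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(follow_up)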
License
The Qwen2 model is licensed under the Apache License 2.0.