ChatGLM-6B

THUDM

Introduction

ChatGLM-6B is an open-source, bilingual conversational language model based on the General Language Model (GLM) framework, featuring 6.2 billion parameters. It is optimized for Chinese and English question-answering and dialogue, utilizing techniques similar to ChatGPT. The model has been trained on approximately 1 trillion tokens and has undergone supervised fine-tuning and reinforcement learning with human feedback. ChatGLM-6B is designed for local deployment on consumer-grade GPUs, requiring only 6GB of memory with INT4 quantization. The model's weights are open for academic research, and commercial use is allowed after registration.

Architecture

ChatGLM-6B is based on the General Language Model (GLM) architecture with 6.2 billion parameters. It supports INT4 quantization, enabling deployment on consumer-level GPUs. This model has been fine-tuned for bilingual dialogue, with optimizations targeting Chinese question-answering and conversation scenarios.
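The 6GB figure for INT4 deployment can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is plain Python (not part of the model's API) estimating the memory needed for the weights alone at different precisions; activations and the KV cache add further overhead on top of these figures.

```python
# Rough estimate of weight memory for a 6.2B-parameter model.
# Covers weights only; runtime activations add additional GPU memory.
PARAMS = 6.2e9

def weight_memory_gb(params: float, bits_per_param: float) -> float:
    """Memory needed to store the weights alone, in gigabytes."""
    return params * bits_per_param / 8 / 1024**3

fp16 = weight_memory_gb(PARAMS, 16)  # half precision, as in .half().cuda()
int8 = weight_memory_gb(PARAMS, 8)
int4 = weight_memory_gb(PARAMS, 4)

print(f"FP16: {fp16:.1f} GB, INT8: {int8:.1f} GB, INT4: {int4:.1f} GB")
```

At INT4 the weights come to roughly 2.9 GB, which together with activations and framework overhead is consistent with the stated 6GB minimum; at FP16 the weights alone exceed 11 GB, which is why quantization matters for consumer GPUs.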

Training

The model was trained on approximately 1 trillion tokens of Chinese and English text. It utilized supervised fine-tuning, feedback bootstrap, and reinforcement learning with human feedback to enhance its performance. This approach allows ChatGLM-6B to produce responses aligned with human preferences.

Guide: Running Locally

  1. Install Dependencies:

    pip install protobuf==3.20.0 transformers==4.27.1 icetk cpm_kernels
    
  2. Load and Run Model:

    from transformers import AutoTokenizer, AutoModel

    # trust_remote_code is required because the model's implementation
    # lives in the repository rather than in the transformers library.
    tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
    model = model.eval()
    # "你好" means "Hello"; history carries the conversation across turns.
    response, history = model.chat(tokenizer, "你好", history=[])
    print(response)
    
  3. Hardware Requirements:

    • Minimum of 6GB GPU memory with INT4 quantization for local deployment (e.g. call `.quantize(4)` on the model before `.half().cuda()`).
    • For enhanced performance, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure.
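In the loading example above, `model.chat` returns both the reply and an updated history, which is passed back in on the next turn so the model keeps conversational context. The sketch below illustrates that pattern with `echo_chat`, a hypothetical stand-in for `model.chat` (the real call needs a GPU and the downloaded weights); the history is a list of (query, response) pairs.

```python
# Illustrative multi-turn loop. echo_chat is a hypothetical stand-in for
# model.chat(tokenizer, query, history=...): it returns a reply plus the
# history extended with the new (query, response) pair.
def echo_chat(query, history):
    response = f"echo: {query}"  # placeholder for the generated reply
    return response, history + [(query, response)]

history = []
for query in ["Hello", "How are you?"]:
    response, history = echo_chat(query, history)
    print(response)

# After two turns, history holds two (query, response) pairs,
# which is the context the model would condition on next turn.
```

Passing `history=[]` starts a fresh conversation; reusing the returned `history` continues it, at the cost of a growing prompt and thus growing GPU memory use per turn.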

License

The code within this repository is open-sourced under the Apache-2.0 License. Usage of the ChatGLM-6B model weights must adhere to the Model License.
