Introduction

ChatGLM-6B is an open-source, bilingual conversational language model based on the General Language Model (GLM) architecture, featuring 6.2 billion parameters. It is optimized for Chinese and English dialogue and question-answering. Utilizing model quantization techniques, it can be deployed locally on consumer-grade GPUs with as little as 6GB of video memory at the INT4 quantization level.
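The 6GB figure follows directly from the parameter count and the bits used per weight. A back-of-the-envelope sketch in plain Python (weights only; activations, KV cache, and framework overhead add to these numbers):

```python
# Approximate weight-memory footprint of a 6.2B-parameter model
# at different quantization levels. Weights only -- actual VRAM
# usage at runtime is higher.

PARAMS = 6.2e9  # ChatGLM-6B parameter count

def weight_memory_gb(bits_per_weight: float) -> float:
    """Gigabytes needed to store all weights at the given precision."""
    return PARAMS * bits_per_weight / 8 / 1e9

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: ~{weight_memory_gb(bits):.1f} GB")
# FP16: ~12.4 GB, INT8: ~6.2 GB, INT4: ~3.1 GB
```

At INT4 the weights occupy roughly 3.1GB, which leaves headroom for activations on a 6GB card; FP16 weights alone already exceed it.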

Architecture

ChatGLM-6B is built on the General Language Model (GLM) architecture, a pretraining framework based on autoregressive blank infilling, and has 6.2 billion parameters. It uses technology similar to ChatGPT and is optimized specifically for Chinese question-answering and dialogue.

Training

The model was trained on approximately 1 trillion tokens of English and Chinese text. Pretraining was followed by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback (RLHF), enabling ChatGLM-6B to produce responses that align with human preferences.

Guide: Running Locally

  1. Download the Model:

    from modelscope import snapshot_download
    model_dir = snapshot_download('Genius-Society/chatglm_6b')
    
  2. Set Up Environment: Clone the repository and navigate to the directory:

    git clone git@hf.co:Genius-Society/chatglm_6b
    cd chatglm_6b
    
  3. Hardware Requirements: To run ChatGLM-6B locally, a consumer-grade GPU with at least 6GB of video memory is required for the INT4-quantized model; FP16 inference needs roughly 13GB. For better performance, consider cloud GPU services such as AWS, GCP, or Azure.
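With the weights downloaded, inference follows the usual transformers pattern. The sketch below is illustrative, not code from this repository: the `THUDM/chatglm-6b` model id, the `quantize(4)` call, and the `chat()` method follow the upstream ChatGLM-6B API and may differ in this mirror. The `build_prompt` helper mirrors the multi-round prompt format that ChatGLM-6B's `chat()` helper assembles internally.

```python
# Minimal inference sketch (assumption: transformers-style API as in the
# upstream THUDM/chatglm-6b repository; this mirror's API may differ).

def build_prompt(history, query):
    """Multi-round prompt format used by ChatGLM-6B's chat() helper."""
    prompt = ""
    for i, (old_query, response) in enumerate(history):
        prompt += f"[Round {i}]\n问：{old_query}\n答：{response}\n"
    prompt += f"[Round {len(history)}]\n问：{query}\n答："
    return prompt

RUN_MODEL = False  # flip to True on a machine with a CUDA GPU

if RUN_MODEL:
    # Heavy: downloads several GB of weights and requires CUDA.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(
        "THUDM/chatglm-6b", trust_remote_code=True)
    # quantize(4) selects the INT4 variant that fits in ~6GB of VRAM
    model = AutoModel.from_pretrained(
        "THUDM/chatglm-6b", trust_remote_code=True).quantize(4).half().cuda()
    model = model.eval()

    response, history = model.chat(tokenizer, "你好", history=[])
    print(response)
```

Each call to `chat()` returns the updated `history`, which is folded back into the next prompt so the model sees the full conversation.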

License

ChatGLM-6B is released under the MIT License, allowing for broad usage and modification.
