ChatGLM3-6B-32K
THUDM
Introduction
ChatGLM3-6B-32K is an enhanced version of ChatGLM3-6B, designed to better understand and handle long texts of up to 32K tokens. It features an updated position encoding and a training method specialized for long texts. If your context stays within 8K tokens, ChatGLM3-6B is recommended; beyond 8K, ChatGLM3-6B-32K is preferred. ChatGLM3-6B retains the excellent features of its predecessors while adding a more powerful base model, more comprehensive function support, and a broader open-source series.
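Because the 8K and 32K cutoffs refer to tokens rather than characters, a quick token count helps decide which model fits a given input. Below is a minimal sketch; document.txt and the threshold check are illustrative, not part of the official usage:

from transformers import AutoTokenizer

# The tokenizer implementation ships with the checkpoint, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)

long_text = open("document.txt").read()  # hypothetical input file
num_tokens = len(tokenizer.encode(long_text))

# Heuristic from this card: ChatGLM3-6B up to ~8K tokens, the 32K variant beyond that.
print(f"{num_tokens} tokens -> "
      f"{'ChatGLM3-6B' if num_tokens <= 8192 else 'ChatGLM3-6B-32K'}")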
Architecture
ChatGLM3-6B-32K builds on an improved base model trained on a more diverse dataset with more training steps, and it shows strong performance across a wide range of datasets. It also supports a newly designed prompt format, native function calling, code execution, and agent tasks. The series includes open-source models available for academic research and, after registration, free commercial use.
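To make the function-call support concrete, here is a rough sketch modeled on the tool-use demo in the ChatGLM3 GitHub repo; the get_weather tool, its schema, and the system-prompt wording are placeholders, so defer to the repo for the authoritative format:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True).half().cuda().eval()

# Hypothetical tool definition following the name/description/parameters
# layout used in the ChatGLM3 repo's tool-use demo.
tools = [{
    "name": "get_weather",
    "description": "Query the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"description": "Name of the city"}},
        "required": ["city"],
    },
}]

# Tools are registered through a system message at the head of the history.
system_item = {
    "role": "system",
    "content": "Answer the following questions as best as you can. "
               "You have access to the following tools:",
    "tools": tools,
}

# Rather than answering directly, the model is expected to emit a
# structured tool call for the caller to execute.
response, history = model.chat(tokenizer, "What is the weather in Beijing?", history=[system_item])
print(response)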
Training
ChatGLM3-6B-32K was trained with an updated position encoding and a training method targeted at long texts, specifically dialogues with long contexts. Evaluations on various datasets demonstrate its superior performance among models with fewer than 10B parameters.
Guide: Running Locally
To run ChatGLM3-6B-32K locally, follow these steps:
- Install dependencies:
pip install protobuf transformers==4.30.2 cpm_kernels "torch>=2.0" gradio mdtex2html sentencepiece accelerate
- Code usage example (a long-input variant is sketched after this list):
from transformers import AutoTokenizer, AutoModel

# trust_remote_code is required because the tokenizer and modeling code
# ship with the checkpoint; .half().cuda() loads the weights in FP16 on GPU.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True).half().cuda()
model = model.eval()

# Single-turn chat; history carries the conversation across turns.
response, history = model.chat(tokenizer, "你好", history=[])  # "你好" = "Hello"
print(response)
- Suggested hardware:
- For optimal performance, using cloud GPUs such as NVIDIA Tesla V100 or A100 is recommended.
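Since the 32K window is the point of this checkpoint, a natural variant of the example above is passing a whole document in a single turn. This is a minimal sketch under stated assumptions: report.txt and the summarization prompt are placeholders, and max_length is raised on the assumption that the chat helper's default is smaller than the 32K window:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True).half().cuda().eval()

document = open("report.txt").read()  # hypothetical long input
prompt = f"Please summarize the following document:\n\n{document}"

# max_length covers the full 32K window; longer inputs must be truncated first.
response, history = model.chat(tokenizer, prompt, history=[], max_length=32768)
print(response)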
Additional instructions, including running the CLI and web demos or using model quantization to save memory (sketched below), are available in the GitHub repo.
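As a rough illustration of the quantization option, ChatGLM releases expose a quantize() method through the checkpoint's remote modeling code, backed by the cpm_kernels dependency installed above; treat the exact call as an assumption and defer to the GitHub repo:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True)

# quantize(4) comes from the remote modeling code, not from transformers
# itself (assumption based on the ChatGLM model family); 4-bit weights
# substantially reduce GPU memory at some cost in output quality.
model = AutoModel.from_pretrained("THUDM/chatglm3-6b-32k", trust_remote_code=True).quantize(4).cuda()
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])  # "你好" = "Hello"
print(response)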
License
The code is open-sourced under the Apache-2.0 license. Use of the ChatGLM3-6B-32K model weights must comply with the Model License.