TQWENDO-36B
Maintainer: nisten
Introduction
TQWENDO-36B is a competitive coding model designed to surpass Qwen2.5-Coder-32B-Instruct. It addresses the repetition issues seen in earlier merges and adds support for speculative decoding and chain-of-thought reasoning. The model merges two pre-trained language models to enhance performance.
Architecture
TQWENDO-36B is an experimental model that combines multiple pre-trained models using a passthrough merge method. The base models include:
- Qwen/Qwen2.5-Coder-32B-Instruct
- huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
The merge aims to improve speculative decoding and chain-of-thought reasoning, enabling efficient text generation with reduced computational requirements.
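Passthrough merges of this kind are commonly expressed as a mergekit configuration that stacks layer slices from the source models. The sketch below is illustrative only: the layer ranges are hypothetical and are not the actual recipe used for TQWENDO-36B.

```yaml
# Hypothetical mergekit passthrough config.
# Layer ranges are illustrative, not the actual TQWENDO-36B recipe.
slices:
  - sources:
      - model: Qwen/Qwen2.5-Coder-32B-Instruct
        layer_range: [0, 48]
  - sources:
      - model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
        layer_range: [40, 64]
merge_method: passthrough
dtype: bfloat16
```

In a passthrough merge, the listed slices are concatenated without weight interpolation, which is how a merge of two 32B models can yield a larger (here ~36B-parameter) network.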
Training
The model combines the Qwen and huihui-ai Coder series. By merging these models, TQWENDO-36B mitigates repetition issues and improves its ability to generate complex code and conversational outputs. It also supports speculative decoding, which accelerates generation by letting a smaller draft model propose tokens that the larger model then verifies.
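The draft-and-verify loop behind speculative decoding is easy to sketch. The toy below uses two hypothetical deterministic "models" over integer token ids purely to show the greedy accept/reject logic; it is not the actual transformers implementation.

```python
def speculative_step(target_next, draft_next, prefix, k=4):
    """One round of draft-and-verify greedy speculative decoding.

    target_next / draft_next: functions mapping a token sequence to the
    next token id (toy stand-ins for real model forward passes).
    Returns the tokens accepted this round.
    """
    # 1. The cheap draft model proposes k tokens autoregressively.
    draft = []
    ctx = list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        draft.append(t)
        ctx.append(t)

    # 2. The target model verifies the proposals; in a real system this
    #    is a single batched forward pass over all k positions.
    accepted = []
    ctx = list(prefix)
    for t in draft:
        expected = target_next(ctx)
        if expected != t:
            # First mismatch: keep the target's own token and stop.
            accepted.append(expected)
            break
        accepted.append(t)
        ctx.append(t)
    return accepted

# Toy "models": the target always emits last token + 1; the draft agrees
# except when the last token is a multiple of 5, where it guesses 7.
target = lambda seq: (seq[-1] + 1) % 100
draft = lambda seq: (seq[-1] + 1) % 100 if seq[-1] % 5 != 0 else 7

print(speculative_step(target, draft, [3], k=4))  # -> [4, 5, 6]
```

Here the draft's first two proposals are accepted and its third is rejected, so three tokens are produced for a single target verification pass. In the transformers library, this pattern is exposed as assisted generation via `model.generate(..., assistant_model=draft_model)`.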
Guide: Running Locally
- Setup Environment: Ensure Python and the necessary libraries, such as `transformers`, are installed.
- Download the Model: Access the model files from Hugging Face: TQWENDO-36B.
- Load the Model: Use the `transformers` library to load the model and tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nisten/tqwendo-36b")
model = AutoModelForCausalLM.from_pretrained("nisten/tqwendo-36b")
```
- Run Inference: Input text and generate responses using the loaded model.
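Qwen-family models expect ChatML-style prompts; in practice `tokenizer.apply_chat_template` builds these for you, but the sketch below shows the assumed wire format. Treat it as illustrative, since the exact template is defined by the model's tokenizer configuration.

```python
def to_chatml(messages):
    """Render a list of {role, content} messages into a ChatML-style
    prompt, ending with an open assistant turn for generation."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    parts.append("<|im_start|>assistant\n")  # model continues from here
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that reverses a string."},
])
```

With the model and tokenizer loaded as above, the prompt is then tokenized with `inputs = tokenizer(prompt, return_tensors="pt")` and passed to `model.generate(**inputs, max_new_tokens=256)` to produce a response.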
Cloud GPUs
For optimal performance, use cloud GPUs such as those offered by AWS, GCP, or Azure to handle the computational load of a 36B-parameter model efficiently.
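A back-of-the-envelope estimate explains why: at 16-bit precision each parameter costs 2 bytes, so the weights alone need roughly 72 GB before activations and KV cache. This is a rough sizing heuristic, not an exact requirement.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Rough memory needed just to hold the model weights, in decimal GB.
    Ignores activations, KV cache, and framework overhead."""
    return n_params * bytes_per_param / 1e9

bf16 = weight_memory_gb(36e9, 2)    # bfloat16/float16: 2 bytes per param
int4 = weight_memory_gb(36e9, 0.5)  # 4-bit quantization: 0.5 bytes per param
print(bf16, int4)  # -> 72.0 18.0
```

The 4-bit figure shows why quantized builds are a popular way to fit models of this size on a single high-memory GPU.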
License
TQWENDO-36B is released under the MIT License, allowing for broad usage and modification while ensuring attribution to the original creators.