TQWENDO-36B

by nisten

Introduction

TQWENDO-36B is a competitive coding model designed to surpass Qwen2.5-Coder-32B-Instruct. It addresses repetition issues seen in earlier attempts and is built with speculative decoding and chain-of-thought reasoning in mind. The model merges several pre-trained language models to enhance performance.

Architecture

TQWENDO-36B is an experimental model that combines multiple pre-trained models using a passthrough merge method. The base models include:

  • Qwen/Qwen2.5-Coder-32B-Instruct
  • huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated

The merge aims to improve speculative decoding and chain-of-thought reasoning, enabling efficient text generation with reduced computational requirements.
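Passthrough merges of this kind are commonly expressed as a mergekit configuration (an assumption here; the card only names the passthrough method). The sketch below is illustrative, and the layer ranges are hypothetical; the actual ranges used to build the 36B stack are defined in the model repository:

```yaml
# Illustrative mergekit passthrough config. Layer ranges are hypothetical:
# overlapping slices from the two 32B donors are stacked to reach ~36B.
slices:
  - sources:
      - model: Qwen/Qwen2.5-Coder-32B-Instruct
        layer_range: [0, 40]
  - sources:
      - model: huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated
        layer_range: [20, 64]
merge_method: passthrough
dtype: bfloat16
```

In a passthrough merge no weights are averaged; selected layer slices are concatenated as-is, which is why the result is larger than either donor model.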

Training

The model combines the Qwen and huihui-ai Coder series. By merging these models, TQWENDO-36B mitigates repetition issues and improves its ability to generate complex code and conversational outputs. It also supports speculative decoding, which accelerates generation by verifying cheaply drafted tokens in a single pass of the large model.

Guide: Running Locally

  1. Setup Environment: Ensure Python and necessary libraries, such as transformers, are installed.
  2. Download the Model: Fetch the model files from Hugging Face (nisten/tqwendo-36b).
  3. Load the Model: Use the transformers library to load the model and tokenizer.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    tokenizer = AutoTokenizer.from_pretrained("nisten/tqwendo-36b")
    # torch_dtype="auto" keeps the checkpoint's native precision, and
    # device_map="auto" (requires accelerate) spreads the 36B parameters
    # across available GPUs instead of loading everything onto one device.
    model = AutoModelForCausalLM.from_pretrained(
        "nisten/tqwendo-36b",
        torch_dtype="auto",
        device_map="auto",
    )
    
  4. Run Inference: Input text and generate responses using the loaded model.
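Step 4 can be sketched as below. The chat-template and generation calls are standard `transformers` usage; the prompt and `max_new_tokens` value are illustrative choices, not recommendations from the model card:

```python
# Sketch of step 4: build a chat prompt and generate a completion.
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_reply(model, tokenizer, question, max_new_tokens=512):
    """Format a single-turn chat prompt and return the model's reply text."""
    messages = [{"role": "user", "content": question}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, dropping the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    tokenizer = AutoTokenizer.from_pretrained("nisten/tqwendo-36b")
    model = AutoModelForCausalLM.from_pretrained(
        "nisten/tqwendo-36b", torch_dtype="auto", device_map="auto"
    )
    print(generate_reply(model, tokenizer,
                         "Write a Python function that reverses a string."))
```

The helper decodes only the tokens past the prompt length, so the returned string contains just the model's answer rather than the full transcript.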

Cloud GPUs

For optimal performance, use cloud GPUs such as those offered by AWS, GCP, or Azure, which can handle the memory and compute demands of a 36B-parameter model efficiently.

License

TQWENDO-36B is released under the MIT License, allowing for broad usage and modification while ensuring attribution to the original creators.
