creative-writer-32b-preview-GGUF

bartowski

Introduction

The Creative Writer 32B Preview is an English-language text generation model focused on creative writing tasks. It is tagged with the multiplicative-lora technique, and this repository distributes it as GGUF files quantized with an imatrix (importance matrix). The repository is also marked as compatible with inference endpoints.

Architecture

The files are quantizations of the base model jukofyork/creative-writer-32b-preview. They are produced with llama.cpp using its imatrix option, which relies on calibration data to preserve output quality at lower bit widths. Several quantization levels are provided, so you can trade quality against memory and disk requirements.

Training

The repository offers quantized versions of the base model rather than a new training run, in types such as Q8_0, Q6_K_L, Q5_K_M, and IQ4_NL. Each type trades output quality against file size and compute cost, and the model card recommends particular types for specific use cases and hardware.
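To make the quality versus size trade-off concrete, the sketch below estimates file sizes from nominal bits per weight. The bits-per-weight figures are derived from llama.cpp's block layouts for the base types, and the 32e9 parameter count is an assumption taken from the "32B" in the model name. Mixed presets such as Q5_K_M and Q6_K_L keep some tensors at higher precision, so the real files listed on the model card will be somewhat larger than these estimates.

```python
# Nominal bits per weight = bytes per block * 8 / weights per block.
# These describe the base block formats only; mixed presets (Q5_K_M, Q6_K_L)
# store some tensors at higher precision and therefore come out larger.
BITS_PER_WEIGHT = {
    "Q8_0": 34 * 8 / 32,     # 32 int8 values plus one fp16 scale
    "Q6_K": 210 * 8 / 256,   # 256-weight super-block
    "Q5_K": 176 * 8 / 256,   # 256-weight super-block
    "IQ4_NL": 18 * 8 / 32,   # 32 4-bit values plus one fp16 scale
}


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Rough file size in GB (10^9 bytes) for a model with n_params weights."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    N_PARAMS = 32e9  # assumption: nominal count implied by "32B"
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_size_gb(N_PARAMS, quant):.1f} GB")
```

Treat the results as a lower bound when planning memory, and use the file sizes published on the model card for the final decision.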

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Hugging Face CLI:
    Ensure the CLI is installed to manage downloads:

    pip install -U "huggingface_hub[cli]"
    
  2. Download the Model:
    Use the CLI to download model files. For example:

    huggingface-cli download bartowski/creative-writer-32b-preview-GGUF --include "creative-writer-32b-preview-Q4_K_M.gguf" --local-dir ./
    
  3. Select Appropriate Quantization:
    Choose a quantization level that fits your hardware's RAM and VRAM capabilities. Reference the file sizes and quality descriptions in the model documentation to make an informed choice.

  4. Run with Suitable Backend:
    Depending on your hardware, use a build with the matching backend (cuBLAS for Nvidia GPUs, rocBLAS for AMD GPUs). If you run on CPU, confirm that the quantization you picked is supported by your build.

  5. Consider Cloud GPUs:
    For optimal performance, especially with larger models, consider running on cloud platforms offering GPU instances.
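Steps 2 and 3 can be sketched as a small helper that picks the largest quantization fitting a memory budget and then builds the matching download command. This is a minimal illustration, not part of the repository: the function names, the 2 GB headroom default, and the sizes in the example are hypothetical placeholders, and the file name pattern is assumed to match the Q4_K_M example in step 2 for every quantization.

```python
from typing import Dict, List, Optional


def pick_quant(sizes_gb: Dict[str, float], memory_gb: float,
               headroom_gb: float = 2.0) -> Optional[str]:
    """Return the largest quantization whose file fits in memory_gb minus headroom.

    headroom_gb is an assumed safety margin for context and runtime buffers.
    """
    budget = memory_gb - headroom_gb
    fitting = {name: size for name, size in sizes_gb.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download_command(repo: str, quant: str, local_dir: str = "./") -> List[str]:
    """Build the huggingface-cli call from step 2 for the chosen quantization."""
    model_name = repo.split("/")[-1]
    if model_name.endswith("-GGUF"):
        model_name = model_name[: -len("-GGUF")]
    # Assumption: every file follows the "<model>-<quant>.gguf" pattern.
    filename = f"{model_name}-{quant}.gguf"
    return ["huggingface-cli", "download", repo,
            "--include", filename, "--local-dir", local_dir]


if __name__ == "__main__":
    # Placeholder sizes for illustration only; read the real ones from the model card.
    example_sizes_gb = {"Q8_0": 35.0, "Q5_K_M": 23.0, "Q4_K_M": 20.0}
    choice = pick_quant(example_sizes_gb, memory_gb=24.0)
    if choice is None:
        print("No listed quantization fits; consider a smaller file or a cloud GPU.")
    else:
        print(" ".join(download_command(
            "bartowski/creative-writer-32b-preview-GGUF", choice)))
```

The command is returned as a list so it can be passed to `subprocess.run` without shell quoting concerns. The example only prints it and downloads nothing.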

License

The Creative Writer 32B Preview is released under the CC-BY-NC-4.0 license, allowing for non-commercial use with attribution.
