PARM-2-Tiny-Instruct-1.7B-QwQ-o1-GGUF

Pinkstack

Introduction

The PARM-2-Tiny-Instruct-1.7B-QwQ-o1-GGUF model, developed by Pinkstack, is a conversational AI model designed to generate text in English. It is fine-tuned for improved conversational capabilities and is part of the larger Parm V2 family of models.

Architecture

This model is a variant of the SmolLM2 architecture created by HuggingFaceTB. It focuses on generating high-quality conversational responses and has some awareness of multiple languages, although it primarily responds in English. It is distributed in several quantization options (F16, Q8, Q5, and Q4) to balance output quality against memory and compute requirements.
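The quantization options trade file size for fidelity. A rough file-size estimate can be derived from the parameter count and an assumed bits-per-weight figure; the per-format values below are ballpark assumptions (quantized GGUF formats also store block scales and metadata), not exact specifications:

```python
# Rough GGUF file-size estimate: parameters * bits-per-weight / 8, in GB.
# Bits-per-weight values are ballpark assumptions, not exact GGUF figures.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8": 8.5,   # assumption: ~8.5 bits/weight incl. block scales
    "Q5": 5.5,   # assumption
    "Q4": 4.5,   # assumption
}

def est_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate model file size in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

if __name__ == "__main__":
    for fmt, bits in BITS_PER_WEIGHT.items():
        print(f"{fmt}: ~{est_size_gb(1.7e9, bits):.2f} GB")
```

For a 1.7B-parameter model this works out to roughly 3.4 GB at F16 and about 1 GB at Q4, consistent with the guidance below that the lower quantizations fit devices with 1-2 GB of VRAM.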

Training

The model was fine-tuned using Unsloth and Hugging Face's TRL library, with training data up to September 2023. The training aimed to enhance the model's conversational capabilities and code generation functionality.

Guide: Running Locally

  1. Select Quantization: Choose the appropriate quantization level for your device:

    • F16: High memory requirement, best quality.
    • Q8: Recommended for high quality; suitable for mobile devices.
    • Q5/Q4: Lower memory requirement, good for most devices with 1-2GB VRAM.
  2. Installation: Install the necessary software packages and download the model weights at your chosen quantization level from Hugging Face.

  3. Execution: Run the model using the provided prompt format (ChatML) to initiate conversations.

  4. Cloud GPUs: For performance optimization, consider using cloud GPU services like AWS EC2, Google Cloud, or Azure.
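The ChatML format mentioned in step 3 wraps each conversation turn in <|im_start|> / <|im_end|> markers. A minimal sketch of building such a prompt by hand (the system message text here is a placeholder, not the model's official one):

```python
def build_chatml_prompt(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string,
    ending with an open assistant turn for the model to complete."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

if __name__ == "__main__":
    prompt = build_chatml_prompt([
        # Placeholder system message, not the model card's official one.
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ])
    print(prompt)
```

A GGUF-capable runtime such as llama.cpp can then be pointed at the downloaded file with this prompt; many runtimes can also apply the model's chat template automatically.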

License

The model is released under the Apache-2.0 license, allowing for both personal and commercial use with proper attribution.