Novaeus-Promptist-7B-Instruct-GGUF

prithivMLmods

Introduction

The Novaeus-Promptist-7B-Instruct is a fine-tuned large language model derived from the Qwen2.5-7B-Instruct base model. It is optimized for prompt enhancement, text generation, and instruction-following tasks, providing high-quality outputs tailored to various applications.

Architecture

The model is distributed in the GGUF format, with several file variants at different precision and quantization levels, such as FP16 and Q4_K_M. It is designed to enhance input prompts, follow complex instructions, and adapt to specific user needs through customization and fine-tuning.
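The practical difference between the FP16 and Q4_K_M variants is disk and memory footprint. A rough estimate, assuming ~7.6B parameters for the Qwen2.5-7B base and an average of about 4.5 bits per weight for Q4_K_M (these figures are approximations, not taken from the model card):

```python
# Back-of-the-envelope size estimate for the GGUF variants.
# Assumptions: ~7.6e9 parameters, FP16 = 16 bits/weight,
# Q4_K_M averages roughly 4.5 bits/weight (mixed 4/6-bit blocks).

def gguf_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size in GiB at the given precision."""
    return n_params * bits_per_weight / 8 / 1024**3

N_PARAMS = 7.6e9  # approximate parameter count of a 7B-class model

fp16_gib = gguf_size_gib(N_PARAMS, 16)
q4km_gib = gguf_size_gib(N_PARAMS, 4.5)

print(f"FP16:   ~{fp16_gib:.1f} GiB")   # roughly 14 GiB
print(f"Q4_K_M: ~{q4km_gib:.1f} GiB")   # roughly 4 GiB
```

In short, the Q4_K_M file fits comfortably on consumer GPUs where the FP16 variant would not.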

Training

  • Base Model: Qwen2.5-7B-Instruct
  • Datasets Used for Fine-Tuning:
    • gokaygokay/prompt-enhancer-dataset: Focuses on prompt engineering with 17.9k samples.
    • gokaygokay/prompt-enhancement-75k: Encompasses a wider array of prompt styles with 73.2k samples.
    • prithivMLmods/Prompt-Enhancement-Mini: A compact dataset (1.16k samples) for iterative refinement.
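Taken together, the three corpora above give a fine-tuning pool of roughly 92k prompt pairs; a quick sanity check using the sample counts listed:

```python
# Total fine-tuning samples implied by the dataset sizes above.
sizes = {
    "gokaygokay/prompt-enhancer-dataset": 17_900,
    "gokaygokay/prompt-enhancement-75k": 73_200,
    "prithivMLmods/Prompt-Enhancement-Mini": 1_160,
}
total = sum(sizes.values())
print(f"{total:,} samples")  # → 92,260 samples
```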

Guide: Running Locally

Setup

  1. Download Files: Ensure all necessary model files, including shards, tokenizer configurations, and index files, are downloaded and placed in the correct directory.
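A small helper can verify that the files from step 1 are in place before loading. The file names below are typical for sharded Transformers checkpoints and may differ from this repository's exact layout; the files themselves can be fetched with `huggingface_hub`'s `snapshot_download`.

```python
# Sketch: check that the required model files from step 1 are present.
# File names are assumptions based on a typical sharded checkpoint.
import os

REQUIRED = [
    "tokenizer_config.json",
    "generation_config.json",
    "pytorch_model.bin.index.json",  # shard index used in step 2
]

def missing_files(model_dir):
    """Return the required files not yet present in model_dir."""
    return [name for name in REQUIRED
            if not os.path.isfile(os.path.join(model_dir, name))]

print(missing_files("./novaeus-promptist-7b"))
```

An empty list means the directory is ready for loading.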

  2. Load Model: Use PyTorch or Hugging Face Transformers to load the model and tokenizer. Ensure pytorch_model.bin.index.json is correctly set for efficient shard-based loading.
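Step 2 can be sketched with the Transformers `Auto*` classes. The repository id below is an assumption; `from_pretrained` consults `pytorch_model.bin.index.json` automatically for sharded checkpoints. The call is wrapped in a function rather than executed here, since it downloads several gigabytes of weights.

```python
# Sketch: load the model and tokenizer with Hugging Face Transformers.
# MODEL_ID is an assumed repo id; verify it on the Hub before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "prithivMLmods/Novaeus-Promptist-7B-Instruct"  # assumed repo id

def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and sharded model weights (multi-GB download)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # place shards across available devices
    )
    return tokenizer, model
```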

  3. Customize Generation: Adjust parameters in generation_config.json to control aspects such as temperature, top-p sampling, and maximum sequence length.
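The keys in `generation_config.json` follow the Transformers `GenerationConfig` schema. The values below are illustrative defaults for step 3, not the model's shipped settings:

```python
# Sketch: write a generation_config.json with custom sampling settings.
# Values are illustrative, not the model's defaults.
import json

generation_config = {
    "do_sample": True,
    "temperature": 0.7,     # lower = more deterministic output
    "top_p": 0.9,           # nucleus (top-p) sampling cutoff
    "max_new_tokens": 512,  # cap on generated sequence length
}

with open("generation_config.json", "w") as f:
    json.dump(generation_config, f, indent=2)

print(json.dumps(generation_config, indent=2))
```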

Run with Ollama

  1. Download and Install Ollama: Download Ollama from the official website and install it on your Windows or macOS system.

  2. Create the Modelfile: Create a plain-text Modelfile and give it a name, e.g., metallama.

  3. Add the FROM Line: Include a FROM line in the Modelfile pointing at the GGUF file to serve, e.g., FROM Llama-3.2-1B.F16.gguf (substitute the filename of the Novaeus-Promptist GGUF you downloaded).

  4. Create the Model: Run ollama create metallama -f ./metallama in the terminal to register the model with Ollama.

  5. Run the Model: Use ollama run metallama to execute the model.
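Steps 2-3 can be scripted: the Modelfile is just text with a FROM line. The GGUF filename below is the example from the text; substitute the actual Novaeus-Promptist GGUF you downloaded.

```python
# Sketch: generate the Ollama Modelfile from steps 2-3.
# The GGUF filename is the example used in the text, not this model's file.
from pathlib import Path

gguf_path = "Llama-3.2-1B.F16.gguf"  # replace with your downloaded GGUF
modelfile = f"FROM {gguf_path}\n"

Path("metallama").write_text(modelfile)
print(modelfile.strip())

# Then, in a terminal:
#   ollama create metallama -f ./metallama
#   ollama run metallama
```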

Suggested Cloud GPUs: For enhanced performance, consider using cloud-based GPU services such as AWS EC2, Google Cloud GPU offerings, or Azure's GPU instances.

License

The model is released under the CreativeML OpenRAIL-M license, allowing for open research and responsible AI development.
