NeverendingStory-Q8_0-GGUF
Introduction
The NeverendingStory-Q8_0-GGUF model is a GGUF-format conversion of the original Aleteian/NeverendingStory model. The conversion was performed with llama.cpp via ggml.ai's GGUF-my-repo space on Hugging Face. For additional details on the original model, refer to its model card.
Architecture
The model uses the transformers library and carries the tags mergekit, merge, llama-cpp, and gguf-my-repo. Inference runs through llama.cpp, which supports both CLI and server-based operation.
Training
Details of the training process are not provided here. Refer to the original model card for information on the training methodology.
Guide: Running Locally
Installing llama.cpp
- Install llama.cpp using Homebrew (compatible with Mac and Linux):
brew install llama.cpp
Running the CLI
- Execute the CLI with the following command:
llama-cli --hf-repo Aleteian/NeverendingStory-Q8_0-GGUF --hf-file neverendingstory-q8_0.gguf -p "The meaning to life and the universe is"
Running the Server
- Start the server with:
llama-server --hf-repo Aleteian/NeverendingStory-Q8_0-GGUF --hf-file neverendingstory-q8_0.gguf -c 2048
Additional Steps
- Clone the llama.cpp repository from GitHub:
git clone https://github.com/ggerganov/llama.cpp
- Navigate into the llama.cpp directory and build it with the LLAMA_CURL=1 flag, adding hardware-specific flags if necessary (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux):
cd llama.cpp && LLAMA_CURL=1 make
- Run inference using the compiled binaries:
./llama-cli --hf-repo Aleteian/NeverendingStory-Q8_0-GGUF --hf-file neverendingstory-q8_0.gguf -p "The meaning to life and the universe is"
or
./llama-server --hf-repo Aleteian/NeverendingStory-Q8_0-GGUF --hf-file neverendingstory-q8_0.gguf -c 2048
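llama-server also exposes an OpenAI-compatible HTTP API. The helper below only builds the JSON request body for a single-turn chat completion; the /v1/chat/completions path, the field names, and the sampling value are assumptions based on common llama.cpp server usage, so check them against your build.

```python
import json


def build_chat_request(user_prompt: str, max_tokens: int = 64) -> str:
    """Return the JSON body for an OpenAI-style chat completion request
    to a local llama-server instance (assumed endpoint:
    /v1/chat/completions)."""
    body = {
        # llama-server serves the GGUF it was started with, so the
        # model field is informational here.
        "model": "neverendingstory-q8_0.gguf",
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.7,  # assumed sampling temperature
    }
    return json.dumps(body)


if __name__ == "__main__":
    print(build_chat_request("The meaning to life and the universe is"))
```

You can POST this body to the running server with any HTTP client, or point an OpenAI-compatible SDK at the server's base URL.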
Cloud GPUs
For enhanced performance, consider running llama.cpp on cloud GPUs from providers such as AWS, Google Cloud, or Azure.
License
Licensing details for the NeverendingStory-Q8_0-GGUF model are not stated in this summary. Review the original model card and associated repositories for licensing information.