Tinyllama 2 1B MiniGuanaco GGUF

TheBloke

Introduction
The Tinyllama 2 1B MiniGuanaco model, created by Odunusi Abraham Ayoola, is available in GGUF format, a new model format introduced by the llama.cpp team. This format replaces the previously used GGML and offers compatibility with various clients and libraries.
Architecture
GGUF is supported by a variety of systems, including llama.cpp, text-generation-webui, KoboldCpp, LM Studio, LoLLMS Web UI, Faraday.dev, ctransformers, llama-cpp-python, and candle. This broad compatibility allows the Tinyllama model to be used across different platforms, with GPU acceleration where the client or library supports it.
Training
The Tinyllama 2 1B MiniGuanaco model has been quantized using several methods to optimize performance and memory usage. The quantization options range from 2-bit to 8-bit, allowing users to select the best balance between model size and quality. These quantization techniques include GGML_TYPE_Q2_K, GGML_TYPE_Q3_K, GGML_TYPE_Q4_K, GGML_TYPE_Q5_K, and GGML_TYPE_Q6_K.
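As a quick way to compare the available quantization levels before downloading, the repository's .gguf files can be listed with the huggingface_hub Python library. This is a minimal sketch, assuming the repository name used in the download step below; the exact set of quantized files it contains may vary.

    from huggingface_hub import hf_hub_download, list_repo_files

    repo_id = "TheBloke/Tinyllama-2-1b-miniguanaco-GGUF"

    # List every quantized GGUF file so the size/quality trade-off can be compared.
    gguf_files = [f for f in list_repo_files(repo_id) if f.endswith(".gguf")]
    print("\n".join(gguf_files))

    # Download one variant; Q4_K_M is the example used throughout this guide.
    path = hf_hub_download(repo_id=repo_id, filename="tinyllama-2-1b-miniguanaco.Q4_K_M.gguf")
    print("Downloaded to:", path)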
Guide: Running Locally
Basic Steps
- Download the model: Use the huggingface-hub Python library to download the specific model file you need. For example:

      pip3 install huggingface-hub
      huggingface-cli download TheBloke/Tinyllama-2-1b-miniguanaco-GGUF tinyllama-2-1b-miniguanaco.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False
- Run using llama.cpp: Ensure you are using llama.cpp from commit d0cee0d or later, and execute the model with the command below (-ngl sets the number of layers offloaded to the GPU, -c sets the context length):

      ./main -ngl 32 -m tinyllama-2-1b-miniguanaco.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Human: {prompt}\n### Assistant:"
- Python integration: Use the ctransformers library to execute the model in Python (a llama-cpp-python alternative is sketched after this list):

      from ctransformers import AutoModelForCausalLM

      llm = AutoModelForCausalLM.from_pretrained(
          "TheBloke/Tinyllama-2-1b-miniguanaco-GGUF",
          model_file="tinyllama-2-1b-miniguanaco.Q4_K_M.gguf",
          model_type="llama",
          gpu_layers=50,
      )
      print(llm("AI is going to"))
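Besides ctransformers, the same GGUF file can be loaded with llama-cpp-python, another of the compatible libraries listed above. The following is a minimal sketch, assuming the Q4_K_M file from the download step is in the current directory; the parameter values simply mirror the llama.cpp command above.

    from llama_cpp import Llama

    # Load the downloaded GGUF file; set n_gpu_layers=0 for CPU-only inference.
    llm = Llama(
        model_path="./tinyllama-2-1b-miniguanaco.Q4_K_M.gguf",
        n_ctx=2048,       # context window, matching -c 2048 above
        n_gpu_layers=32,  # layers offloaded to the GPU, matching -ngl 32 above
    )

    # Use the same prompt template as the llama.cpp example.
    output = llm(
        "### Human: Explain what the GGUF format is in one sentence.\n### Assistant:",
        max_tokens=128,
        temperature=0.7,
        repeat_penalty=1.1,
    )
    print(output["choices"][0]["text"])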
Cloud GPUs
For enhanced performance, consider running on a cloud GPU service; the tools above support CUDA (NVIDIA), ROCm (AMD), and Metal (Apple) acceleration, depending on the hardware available.
License
The Tinyllama 2 1B MiniGuanaco model is available under an unspecified license. Users should verify and comply with any applicable terms and conditions.