tokyotech-llm-Llama-3.1-Swallow-70B-Instruct-v0.3-gguf
mmnga

Introduction
The tokyotech-llm-Llama-3.1-Swallow-70B-Instruct-v0.3-gguf model is a GGUF conversion of the Llama-3.1-Swallow-70B-Instruct-v0.3 model developed by tokyotech-llm. It is designed for instruction-following tasks and supports English and Japanese. The GGUF quantization uses the TFMC/imatrix dataset as calibration data for importance-matrix (imatrix) quantization.
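Before running the model, the quantized GGUF file must be downloaded from the Hugging Face Hub. A minimal sketch using the Hugging Face CLI follows; the repository id and file name shown are assumptions based on this model's name, so check the actual file listing on the Hub before running.

```shell
# Download one quantization variant of the GGUF model.
# NOTE: repo id and file name are assumed from the model name; verify them
# against the actual repository listing on Hugging Face before use.
huggingface-cli download \
  mmnga/tokyotech-llm-Llama-3.1-Swallow-70B-Instruct-v0.3-gguf \
  tokyotech-llm-Llama-3.1-Swallow-70B-Instruct-v0.3-Q4_0.gguf \
  --local-dir .
```

A 70B model at 4-bit quantization is tens of gigabytes, so ensure sufficient disk space before downloading.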
Architecture
The model is based on the Llama-3.1 architecture with 70 billion parameters. It has been optimized for instruction-following tasks and is distributed in the GGUF format. The quantization leverages the TFMC/imatrix dataset, a calibration corpus curated for Japanese language models, which helps preserve Japanese performance in the quantized weights.
Training
The base Swallow model was built by tokyotech-llm through continual pre-training of Llama 3.1 with an emphasis on Japanese language capability. For this GGUF release, the TFMC/imatrix dataset is used not as training data but as calibration data for imatrix quantization: it is a text collection curated for Japanese LLMs, and calibrating on it helps the quantized model retain its Japanese processing capabilities.
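For readers reproducing the quantization themselves, the general imatrix workflow in llama.cpp looks like the sketch below. The file names are placeholders, and the calibration text file stands in for the TFMC/imatrix dataset; adjust paths and the target quantization type to your setup.

```shell
# Sketch of imatrix quantization with llama.cpp (file names are placeholders).
# 1. Compute an importance matrix from Japanese calibration text.
build/bin/llama-imatrix \
  -m model-f16.gguf \
  -f calibration-data.txt \
  -o imatrix.dat

# 2. Quantize using the importance matrix to better preserve important weights.
build/bin/llama-quantize \
  --imatrix imatrix.dat \
  model-f16.gguf \
  model-Q4_K_M.gguf \
  Q4_K_M
```

The importance matrix records which weights matter most on the calibration text, so quantization error is steered away from them.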
Guide: Running Locally
To run the model locally, follow these steps:
1. Clone the llama.cpp repository:

   git clone https://github.com/ggerganov/llama.cpp.git
   cd llama.cpp

2. Build the project with CUDA support:

   cmake -B build -DGGML_CUDA=ON
   cmake --build build --config Release

3. Run the model:

   build/bin/llama-cli -m 'tokyotech-llm-Llama-3.1-Swallow-70B-Instruct-v0.3-Q4_0.gguf' -n 128 -c 128 -p 'あなたはプロの料理人です。レシピを教えて' -cnv
For efficient execution, consider using cloud platforms offering GPU instances, such as AWS, Google Cloud, or Azure.
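The -cnv flag makes llama-cli apply the model's chat template automatically. When driving the model through a raw completion API instead, the prompt must be formatted manually. The sketch below shows the standard Llama 3.1 chat template as a small helper function; the function name is my own, not part of any library.

```python
# Sketch of the Llama 3.1 chat template that llama-cli applies in -cnv mode.
# Useful when sending raw prompts to the model without a chat wrapper.

def build_llama3_prompt(messages):
    """Render a list of {"role", "content"} dicts into a Llama 3.1 prompt string."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n{msg['content']}<|eot_id|>"
        )
    # Leave the assistant header open so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

if __name__ == "__main__":
    prompt = build_llama3_prompt([
        {"role": "system", "content": "You are a professional chef."},
        {"role": "user", "content": "Please share a recipe."},
    ])
    print(prompt)
```

Generation should stop when the model emits the <|eot_id|> token, which marks the end of its turn.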
License
The model is released under the Llama 3.1 Community License. Users should review the license terms for compliance and usage guidelines.