WabiSabi-V1
Local-Novel-LLM-project
Introduction
WabiSabi-V1 is a fine-tuned version of the Mistral-7B-v0.1 model, designed to enhance text generation in both English and Japanese. It features a larger context window and improved retention of context over long text sequences. The model was developed using resources from the LocalAI hackathon.
Architecture
WabiSabi-V1 is based on the Mistral-7B architecture. Key enhancements include:
- Context window expanded from 8k to 128k.
- High-quality text generation in both Japanese and English.
- Ability to generate NSFW content.
- Improved memory retention for long-context generation.
Training
The model was trained using various methods, including:
- Chat Vector applied across multiple models.
- Simple linear merging of the resulting models (illustrated in the sketch after this list).
- Domain and sentence enhancement with LoRA.
- Context expansion.
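The card does not publish the exact merge recipe, but the Chat Vector and linear-merge steps can be illustrated roughly as follows. This is a minimal sketch: the non-base checkpoint names, the chat-vector strength, and the merge weight are all hypothetical, not the values actually used for WabiSabi-V1.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint names: the actual source models and mixing
# ratios used for WabiSabi-V1 are not documented in this card.
BASE = "mistralai/Mistral-7B-v0.1"
CHAT = "example-org/mistral-7b-chat"    # assumed instruction-tuned derivative
JA = "example-org/mistral-7b-japanese"  # assumed Japanese-domain derivative

base_sd = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16).state_dict()
chat_sd = AutoModelForCausalLM.from_pretrained(CHAT, torch_dtype=torch.bfloat16).state_dict()
ja_sd = AutoModelForCausalLM.from_pretrained(JA, torch_dtype=torch.bfloat16).state_dict()

alpha = 1.0  # strength of the chat vector (assumed value)
beta = 0.5   # linear-merge weight toward the Japanese model (assumed value)

merged = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
with torch.no_grad():
    for name, param in merged.named_parameters():
        # Chat Vector: the weight delta between a chat-tuned model and its base.
        chat_vector = chat_sd[name] - base_sd[name]
        # Add the chat vector to the base, then blend linearly with the
        # Japanese-domain model.
        param.copy_((1 - beta) * (base_sd[name] + alpha * chat_vector) + beta * ja_sd[name])

merged.save_pretrained("wabisabi-merge-sketch")
```

Note that this naive version holds several full 7B checkpoints in memory at once; real merge pipelines typically stream tensors or use dedicated tooling such as mergekit.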
The instruction format is Vicuna-v1.1. Be mindful of potential biases in the training data, and note that memory usage can be significant during long-context inference.
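Since the card specifies the Vicuna-v1.1 instruction format, prompts should follow the Vicuna layout of a system preamble followed by USER/ASSISTANT turns. Below is a minimal helper; the system message shown is the standard Vicuna v1.1 preamble and is an assumption, since the card does not reproduce it.

```python
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_vicuna_prompt(user_message: str, system: str = VICUNA_SYSTEM) -> str:
    # Vicuna-v1.1 layout: system preamble, then USER/ASSISTANT turns.
    # The trailing "ASSISTANT:" cues the model to start its reply.
    return f"{system} USER: {user_message} ASSISTANT:"

# Example (Japanese): "Please write a short story about cherry blossoms."
prompt = build_vicuna_prompt("桜をテーマに短い物語を書いてください。")
```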
Guide: Running Locally
To run WabiSabi-V1 locally, follow these steps:
- Set up the environment: make sure the required libraries, such as Transformers, are installed.
- Download the model: retrieve the weights and configuration files from the Hugging Face repository.
- Load the model: use a supported library such as Transformers or llama.cpp for inference; llama.cpp is recommended for better performance.
- Run inference: generate text from your prompts. A minimal Transformers example follows below.
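As a concrete version of the steps above, the sketch below loads the checkpoint with Transformers and runs a single generation. The repository id, dtype, and sampling parameters are assumptions; adjust them to the actual Hugging Face repo and your hardware.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Local-Novel-LLM-project/WabiSabi-V1"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # fall back to float16 if bfloat16 is unsupported
    device_map="auto",           # requires accelerate; places layers on available GPUs
)

# Vicuna-v1.1 prompt, as described in the Training section.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: Write a short opening scene for a novel set in Kyoto. ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

If you use llama.cpp instead, convert the weights to GGUF first; quantized GGUF files substantially reduce memory usage.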
For optimal performance, using cloud GPUs such as those from AWS, Google Cloud, or Azure is recommended due to the model's high memory demands.
License
WabiSabi-V1 is licensed under the Apache-2.0 License.