NemoMix-Unleashed-12B
MarinaraSpaghetti
Introduction
NemoMix-Unleashed-12B is a text generation model developed by merging several pre-trained language models. It is optimized for tasks like role-playing (RP) and storytelling, and exhibits reduced repetition at higher context lengths. Its development acknowledges contributions from MistralAI, Intervitens, Sao10K, and Nbeerbower.
Architecture
The model is built using the mergekit tool, employing a della_linear merge method. It integrates multiple models, including:
- Intervitens Mini-Magnum-12b-v1.1
- Nbeerbower Mistral-Nemo-Bophades-12B
- Sao10K MN-12B-Lyra-v1
- Nbeerbower Mistral-Nemo-Gutenberg-12B
- MistralAI Mistral-Nemo-Instruct-2407
The base model is MistralAI Mistral-Nemo-Base-2407, using the bfloat16 data type and a specific tokenizer.
Training
The model was trained with configurations that assign different weights and densities to each of the merged components. Parameters include:
- Epsilon: 0.05
- Lambda: 1
These parameters fine-tune the merging process to balance the contributions of each component.
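Put together, a della_linear merge of this kind is expressed as a mergekit YAML config roughly like the sketch below. The per-model weight and density values are illustrative placeholders (the exact published recipe assigns its own), and the Hugging Face repo ids are assumed:

```yaml
# Sketch of a mergekit della_linear config; weights/densities are placeholders.
models:
  - model: intervitens/mini-magnum-12b-v1.1     # assumed repo id
    parameters:
      weight: 0.2
      density: 0.4
  - model: nbeerbower/mistral-nemo-bophades-12B  # assumed repo id
    parameters:
      weight: 0.2
      density: 0.4
  - model: Sao10K/MN-12B-Lyra-v1                 # assumed repo id
    parameters:
      weight: 0.2
      density: 0.4
  - model: nbeerbower/mistral-nemo-gutenberg-12B # assumed repo id
    parameters:
      weight: 0.2
      density: 0.4
  - model: mistralai/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.2
      density: 0.4
merge_method: della_linear
base_model: mistralai/Mistral-Nemo-Base-2407
parameters:
  epsilon: 0.05   # per the model card
  lambda: 1       # per the model card
dtype: bfloat16
```

Running `mergekit-yaml config.yaml ./output-dir` with a config like this produces the merged checkpoint; epsilon and lambda control the della pruning and rescaling, while each model's weight and density set its contribution.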
Guide: Running Locally
- Environment Setup: Ensure you have Python and the necessary libraries, such as transformers, installed.
- Clone Repository: Obtain the model from Hugging Face.
- Load Model: Use the transformers library to load the model and tokenizer.
- Run Inference: Use the recommended sampling parameters, such as Temperature (1.0-1.25) with Top A (0.1) or Min P (0.01-0.1).
For optimal performance, especially at the 12B-parameter scale, consider using cloud GPU services such as AWS, Google Cloud, or Azure.
License
The model and its components are shared under licenses specified on their respective pages on Hugging Face. Ensure compliance with each component's licensing terms when using the model.