NemoMix-Unleashed-12B

MarinaraSpaghetti

Introduction

NemoMix-Unleashed-12B is a text generation model created by merging several pre-trained language models. It is optimized for tasks like role-playing (RP) and storytelling, showing reduced repetition at higher context lengths. The author acknowledges contributions from MistralAI, Intervitens, Sao10K, and Nbeerbower.

Architecture

The model is built using the mergekit tool, employing a della_linear merge method. It integrates multiple models, including:

  • Intervitens Mini-Magnum-12b-v1.1
  • Nbeerbower Mistral-Nemo-Bophades-12B
  • Sao10K MN-12B-Lyra-v1
  • Nbeerbower Mistral-Nemo-Gutenberg-12B
  • MistralAI Mistral-Nemo-Instruct-2407

The base model is MistralAI Mistral-Nemo-Base-2407, merged in the bfloat16 data type with a tokenizer taken from one of the source models.
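The merge described above can be expressed as a mergekit configuration. The sketch below follows mergekit's YAML schema; the per-model weight and density values are hypothetical placeholders, since the card only specifies the method, components, epsilon, and lambda.

```yaml
# Illustrative mergekit config; weight/density values are placeholders,
# not the published ones.
merge_method: della_linear
base_model: mistralai/Mistral-Nemo-Base-2407
models:
  - model: intervitens/mini-magnum-12b-v1.1
    parameters:
      weight: 0.1   # hypothetical
      density: 0.4  # hypothetical
  - model: nbeerbower/mistral-nemo-bophades-12B
    parameters:
      weight: 0.1   # hypothetical
      density: 0.4  # hypothetical
  - model: Sao10K/MN-12B-Lyra-v1
    parameters:
      weight: 0.1   # hypothetical
      density: 0.4  # hypothetical
  - model: nbeerbower/mistral-nemo-gutenberg-12B
    parameters:
      weight: 0.1   # hypothetical
      density: 0.4  # hypothetical
  - model: mistralai/Mistral-Nemo-Instruct-2407
    parameters:
      weight: 0.1   # hypothetical
      density: 0.4  # hypothetical
parameters:
  epsilon: 0.05
  lambda: 1
dtype: bfloat16
```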

Training

The merge configuration assigns different weights and densities to each of the merged components. Global parameters include:

  • Epsilon: 0.05
  • Lambda: 1

These parameters fine-tune the merging process to balance the contributions of each component.
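To illustrate how weights and densities interact, the sketch below shows a simplified delta-linear merge in plain Python: each component contributes its delta from the base model, randomly pruned to a given density and rescaled, then linearly combined. This is a toy illustration of the idea, not the actual DELLA algorithm used by mergekit.

```python
import random

def delta_linear_merge(base, components, weights, densities, lam=1.0, seed=0):
    """Toy sketch of a density-pruned linear delta merge.

    base       -- list of base-model parameter values
    components -- list of parameter lists, one per merged model
    weights    -- contribution weight for each component
    densities  -- fraction of each component's delta to keep
    lam        -- global scaling factor (the card's lambda = 1)
    """
    rng = random.Random(seed)
    merged = list(base)
    for params, w, d in zip(components, weights, densities):
        for i, (p, b) in enumerate(zip(params, base)):
            delta = p - b
            # Keep each delta element with probability `d`, rescaling
            # survivors by 1/d so the expected contribution is preserved.
            if rng.random() < d:
                merged[i] += lam * w * (delta / d)
    return merged

# With density 1.0 nothing is pruned, so the result is exactly
# base + weight * delta for each element.
merged = delta_linear_merge(
    base=[1.0, 2.0],
    components=[[2.0, 2.0]],
    weights=[0.5],
    densities=[1.0],
)
```

With density 1.0 the merge reduces to a plain weighted sum of deltas; lower densities sparsify each component's contribution before combining.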

Guide: Running Locally

  1. Environment Setup: Ensure you have Python and necessary libraries like transformers installed.
  2. Clone Repository: Obtain the model from Hugging Face.
  3. Load Model: Use the transformers library to load the model and tokenizer.
  4. Run Inference: Use recommended parameters such as Temperature (1.0-1.25) with Top A (0.1) or Min P (0.01-0.1).

For optimal performance with a 12B-parameter model, consider using cloud GPU services such as AWS, Google Cloud, or Azure.
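The loading and inference steps above can be sketched with the transformers library. The sampling values below sit inside the card's recommended ranges; note that transformers exposes Min P as the `min_p` generation argument, while Top A is not a standard transformers option (it is available in some other samplers).

```python
MODEL_ID = "MarinaraSpaghetti/NemoMix-Unleashed-12B"

# Sampling parameters chosen from the card's recommended ranges:
# Temperature 1.0-1.25, Min P 0.01-0.1.
GEN_KWARGS = {
    "do_sample": True,
    "temperature": 1.1,
    "min_p": 0.05,
    "max_new_tokens": 256,
}

def generate(prompt: str) -> str:
    # Imported here so the sketch can be inspected without the
    # (large) transformers dependency installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # A 12B model needs roughly 24 GB of memory in bfloat16; a GPU
    # (local or cloud) is strongly recommended.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="bfloat16", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate("Once upon a time")` downloads the weights on first use and returns the decoded continuation.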

License

The model and its components are shared under licenses specified on their respective pages on Hugging Face. Ensure compliance with each component's licensing terms when using the model.
