S A I N E M O re M I X
MoralianeSAINEMO-reMIX
Introduction
SAINEMO-reMIX is a merged model designed for text generation, particularly useful in role-play and conversational contexts. It supports both Russian and English languages and utilizes the transformers
library. The model is a combination of several pre-trained models created using the mergekit
tool.
Architecture
The model integrates various 12B parameter models, including:
- IlyaGusev/saiga_nemo_12b
- elinas/Chronos-Gold-12B-1.0
- Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24
- MarinaraSpaghetti/NemoMix-Unleashed-12B
The merging process employs the della_linear
method, prioritizing certain models for specific language support and role-play capabilities.
Training
The merge process uses specific weights and densities for each contributing model:
- IlyaGusev/saiga_nemo_12b: Emphasis on Russian language with a weight of 0.55.
- MarinaraSpaghetti/NemoMix-Unleashed-12B: Role-play focus, weight 0.2.
- elinas/Chronos-Gold-12B-1.0: Another role-play model with a weight of 0.15.
- Vikhrmodels/Vikhr-Nemo-12B-Instruct-R-21-09-24: Balances Russian support with a weight of 0.25.
The model relies on a 16-bit floating point precision (float16
) and uses the base model's tokenizer.
Guide: Running Locally
- Setup: Ensure you have Python and the necessary libraries, such as
transformers
. - Download: Obtain the model from the Hugging Face repository.
- Configuration: Set up the environment using the provided YAML configuration.
- Run: Execute the text generation script using your desired settings.
- Cloud GPUs: Consider using cloud services like AWS, Google Cloud, or Azure for GPU resources to enhance performance.
License
The SAINEMO-reMIX model is subject to the licenses of the base models used in its creation, which may include various open-source licenses. Users should consult the respective model pages for specific licensing information.