Luminum-v0.1-123B

FluffyKaeloky

Introduction

Luminum-123B is a merged model utilizing Mistral Large as a base, combined with Lumimaid-v0.2-123B and Magnum-v2-123B. This model aims to balance the strengths of its components, maintaining Mistral's "brain power" while integrating the lexicon of Lumimaid and creative flair from Magnum, resulting in a model suitable for generating coherent and detailed text.

Architecture

Luminum-123B is based on a merged architecture using the following components:

  • Base Model: Mistral Large
  • Merged Models: Lumimaid-v0.2-123B and Magnum-v2-123B
  • Configuration: Utilizes a della_linear merge method with specific weights and densities for each model component.
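To make the della_linear idea concrete, here is a toy numpy sketch of the general scheme: each fine-tune's delta from the base is stochastically pruned toward a target density (with keep-probabilities spread by epsilon according to delta magnitude), the survivors are rescaled, and the pruned deltas are linearly combined onto the base. This is a simplified illustration, not mergekit's actual implementation; the function name and the exact pruning rule are assumptions.

```python
import numpy as np

def della_linear_merge(base, finetunes, weights, densities,
                       eps=0.05, lam=1.0, seed=0):
    """Toy sketch of a della_linear-style merge on flat parameter vectors.

    For each fine-tune: compute delta = finetune - base, keep roughly
    `density` of its entries (larger-magnitude entries get keep-probability
    up to density + eps, smaller ones down to density - eps), rescale the
    survivors by 1/density to preserve the expected delta, then add the
    weighted, lambda-scaled result onto the base.
    """
    rng = np.random.default_rng(seed)
    merged = base.astype(np.float64).copy()
    for ft, w, d in zip(finetunes, weights, densities):
        delta = ft - base
        # Rank entries by |delta| so bigger deltas are more likely to survive.
        ranks = np.argsort(np.argsort(np.abs(delta)))
        keep_p = (d - eps) + 2 * eps * ranks / max(len(delta) - 1, 1)
        mask = rng.random(delta.shape) < keep_p
        pruned = np.where(mask, delta / d, 0.0)  # rescale survivors
        merged += lam * w * pruned
    return merged
```

With density 1.0 and epsilon 0 this reduces to a plain weighted linear merge; lower densities sparsify each delta before combining, which is what lets several fine-tunes be blended without their changes interfering as much.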

Training

The model was not further trained; it was produced by merging with the following parameters:

  • Weights and Densities: Lumimaid (weight: 0.34, density: 0.8), Magnum (weight: 0.19, density: 0.5)
  • Merge Method: della_linear
  • Base Model: mistralai/Mistral-Large-Instruct-2407
  • Additional Parameters: epsilon (0.05), lambda (1), int8_mask (true), dtype (bfloat16)
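The parameters above can be collected into a mergekit-style YAML configuration. The field names below follow mergekit's schema, but the exact config and the constituent repo ids are reconstructed assumptions, not the author's published file:

```yaml
models:
  - model: NeverSleep/Lumimaid-v0.2-123B   # repo id assumed
    parameters:
      weight: 0.34
      density: 0.8
  - model: anthracite-org/magnum-v2-123b   # repo id assumed
    parameters:
      weight: 0.19
      density: 0.5
merge_method: della_linear
base_model: mistralai/Mistral-Large-Instruct-2407
parameters:
  epsilon: 0.05
  lambda: 1
  int8_mask: true
dtype: bfloat16
```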

Guide: Running Locally

To run Luminum-123B locally, follow these steps:

  1. Install Prerequisites:

    • Ensure Python and the necessary libraries, such as transformers and torch, are installed.
  2. Download Model:

    • Access the model files from the repository, ensuring all dependencies are downloaded.
  3. Set Up Environment:

    • Configure your environment to use high-performance GPUs. At 123B parameters, the model needs roughly 250 GB of VRAM in bfloat16 (i.e. multiple A100/H100-class GPUs), or a quantized variant for smaller setups. Suggested cloud GPU providers include AWS, Google Cloud, and Azure.
  4. Run the Model:

    • Use the Hugging Face Transformers library to load and run the model locally with your specified input.
  5. Model Settings:

    • Recommended settings: Min-p (0.08), repetition penalty (1.03), repetition penalty range (4096), smoothing factor (0.23), no-repeat n-gram size (2).
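Note that these settings are sampler parameters as exposed by inference frontends (e.g. SillyTavern or text-generation-webui); smoothing factor and repetition penalty range are frontend features rather than standard `generate()` arguments in transformers. To illustrate what the Min-p (0.08) setting does, here is a minimal standalone sketch of min-p filtering on a toy token distribution (the function name is ours, not a library API):

```python
import numpy as np

def min_p_filter(logits, min_p=0.08):
    """Keep only tokens whose probability is at least min_p times the
    probability of the most likely token, then renormalize.

    Toy single-step sketch: real backends apply this to batched logits
    and combine it with other samplers (repetition penalty, temperature).
    """
    probs = np.exp(logits - logits.max())  # stable softmax
    probs /= probs.sum()
    keep = probs >= min_p * probs.max()    # dynamic, peak-relative cutoff
    filtered = np.where(keep, probs, 0.0)
    return filtered / filtered.sum()
```

The subset of these settings that transformers does support can be passed directly, e.g. `model.generate(..., min_p=0.08, repetition_penalty=1.03, no_repeat_ngram_size=2)` on recent transformers versions (min_p support is an assumption about your installed version; check your release notes).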

License

The Luminum-123B model is subject to the licensing terms and conditions specified by the original model proprietors. Users should review these terms to ensure compliance with any restrictions on usage or distribution.
