AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-v2

redrix

Introduction

AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-v2 is a merged pre-trained language model. It combines several pre-trained 12B models using merge techniques intended to improve performance on text-generation tasks.

Architecture

The model is the product of merging several pre-trained models, with the goal of improving language understanding through a specific merge configuration. The merge uses the DELLA linear method with TheDrummer/UnslopNemo-12B-v4 as the base model. Per-model weights for components such as the self-attention and MLP layers, together with density settings, balance the contribution of each source model.

Models Merged

  • inflatebot/MN-12B-Mag-Mell-R1
  • DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
  • ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2

Training

The merge is defined by a YAML configuration with specific parameters for each model (an illustrative sketch follows the list):

  • Weights are assigned to different components like self-attention and MLP layers.
  • Density settings adjust the contribution intensity of each model.
  • Base Model: TheDrummer/UnslopNemo-12B-v4.
  • Merge Method: DELLA Linear.
  • Data Type: bfloat16.
  • Other Parameters: normalization and int8 masking, with specific values for the DELLA epsilon and lambda hyperparameters.
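
As a rough illustration of what such a mergekit recipe can look like, the sketch below uses the della_linear method with the listed source models on the UnslopNemo base. The weight, density, epsilon, and lambda numbers are placeholders for illustration, not the values of the released merge.

```yaml
# Hypothetical mergekit recipe; weights, densities, epsilon and lambda are
# placeholder values, not the published configuration.
merge_method: della_linear
base_model: TheDrummer/UnslopNemo-12B-v4
dtype: bfloat16
parameters:
  normalize: true
  int8_mask: true
  epsilon: 0.05   # DELLA drop-probability range parameter (assumed value)
  lambda: 1.0     # DELLA rescaling factor (assumed value)
models:
  - model: inflatebot/MN-12B-Mag-Mell-R1
    parameters:
      density: 0.6
      weight:
        - filter: self_attn   # separate weight for self-attention tensors
          value: 0.3
        - filter: mlp         # separate weight for MLP tensors
          value: 0.3
        - value: 0.25         # default weight for all other tensors
  - model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
    parameters:
      density: 0.6
      weight: 0.25
  - model: ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
    parameters:
      density: 0.6
      weight: 0.25
```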

Guide: Running Locally

  1. Install Dependencies: Install the transformers library (plus torch); mergekit is only needed if you want to reproduce the merge itself.
  2. Download Model: Retrieve the model from the Hugging Face repository.
  3. Configure Environment: Set up your environment to use bfloat16 for efficient computation.
  4. Run Inference: Use the model for text generation tasks, as in the sketch following this list.
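
For steps 3 and 4, a minimal inference sketch using the transformers library is shown below. The repository id, prompt, and sampling settings are assumptions for illustration; check the model card's recommended prompt template and generation parameters before relying on them.

```python
# Minimal text-generation sketch (assumed repo id and settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "redrix/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-v2"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's bfloat16 data type
    device_map="auto",           # place layers on the available GPU(s)
)

# Use the tokenizer's chat template if the repository defines one.
messages = [{"role": "user", "content": "Write a short, dark fantasy scene."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,   # illustrative sampling settings
    top_p=0.95,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```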

Suggested Cloud GPUs

  • Consider GPUs such as the NVIDIA A100 or V100 for optimal performance, given the model's size (roughly 24 GB of weights at bfloat16 precision).

License

Refer to the license listed in the model's Hugging Face repository for the detailed terms and conditions that apply.
