AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-v2
by redrix
Introduction
AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS-v2 is a merged, pre-trained language model. It combines several 12B models using merge techniques designed to enhance its performance on text generation tasks.
Architecture
The model is a product of merging several pre-trained models with a focus on enhancing language understanding through a specific architectural configuration. The merge uses the linear DELLA method, with TheDrummer/UnslopNemo-12B-v4 as the base. The configuration employs parameters such as self-attention weights and various density settings to balance the contributions of each model.
Models Merged
- inflatebot/MN-12B-Mag-Mell-R1
- DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
- ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
Training
The model configuration involves a YAML setup with specific parameters for each model:
- Weights are assigned to different components like self-attention and MLP layers.
- Density settings adjust the contribution intensity of each model.
- Base Model: TheDrummer/UnslopNemo-12B-v4.
- Merge Method: DELLA Linear.
- Data Type: bfloat16.
- Other Parameters: Include normalization and int8 masking, with specific values for epsilon and lambda.
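The YAML setup described above can be sketched as a mergekit configuration. This is a hedged illustration only: the field layout follows mergekit's `della_linear` config format, but all weight, density, epsilon, and lambda values below are placeholder assumptions, not the values actually used for this merge.

```yaml
# Illustrative mergekit config for a della_linear merge.
# All numeric values are assumed placeholders, not the author's settings.
base_model: TheDrummer/UnslopNemo-12B-v4
merge_method: della_linear
dtype: bfloat16
parameters:
  normalize: true
  int8_mask: true
  epsilon: 0.05   # assumed
  lambda: 1.0     # assumed
models:
  - model: inflatebot/MN-12B-Mag-Mell-R1
    parameters:
      weight: 0.33   # assumed
      density: 0.5   # assumed
  - model: DavidAU/MN-GRAND-Gutenberg-Lyra4-Lyra-12B-DARKNESS
    parameters:
      weight: 0.33   # assumed
      density: 0.5   # assumed
  - model: ArliAI/Mistral-Nemo-12B-ArliAI-RPMax-v1.2
    parameters:
      weight: 0.33   # assumed
      density: 0.5   # assumed
```

Per-component weighting (e.g. separate values for self-attention and MLP layers, as the configuration above describes) is expressed in mergekit via parameter filters; consult the actual config in the model repository for the real values.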
Guide: Running Locally
- Install Dependencies: Ensure you have the `transformers` and `mergekit` libraries installed.
- Download Model: Retrieve the model from the Hugging Face repository.
- Configure Environment: Set up your environment to use `bfloat16` for efficient computation.
- Run Inference: Use the model for text generation tasks.
Suggested Cloud GPUs
- Consider GPUs such as the NVIDIA A100 or V100; the model's size and bfloat16 precision make high-memory accelerators the practical choice for inference.
License
The model uses a license that aligns with Hugging Face's community standards and practices. Please refer to the specific license in the Hugging Face repository for detailed terms and conditions.