Hermes 3 Llama 3.1 405B Samantha
Introduction
The Hermes-3-Llama-3.1-405B-Samantha model is built on the Hermes-3-Llama-3.1-405B-Uncensored foundation and fine-tuned on the Samantha-newdataset-morelarge dataset. Training covers chain-of-thought (CoT) thinking tags and roleplay. The model is released under the Llama 3.1 license and is intended for uncensored use.
Architecture
The model uses an AutoTokenizer tokenizer and is configured to load in 4-bit precision, with a sequence length of 2048 and sample packing enabled. The architecture is augmented with QLoRA adapters, and training parameters were chosen to optimize performance; a configuration sketch follows below.
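The following is a minimal sketch of how such a 4-bit QLoRA setup might be expressed with transformers and peft. The base repository id and all adapter hyperparameters (r, alpha, dropout) are illustrative assumptions, not values taken from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model weights in 4-bit precision (quantization details are assumptions).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_repo = "nicoboss/Hermes-3-Llama-3.1-405B-Uncensored"  # assumed base repository id
tokenizer = AutoTokenizer.from_pretrained(base_repo)
model = AutoModelForCausalLM.from_pretrained(
    base_repo,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach QLoRA adapters on top of the quantized base; values here are illustrative only.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```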
Training
The model was trained on RunPod using 5 L40 GPUs, 160 vCPUs, and 1251 GiB of RAM. Training used a learning rate of 1e-05 with the AdamW optimizer and a cosine learning rate scheduler, a batch size of 1 with gradient accumulation over 4 steps, and ran for one epoch with 10 warmup steps.
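As an illustration, the reported hyperparameters map onto transformers TrainingArguments as sketched below; the actual run used its own training harness, and output_dir and the bf16 setting are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./hermes-3-samantha",   # placeholder path (assumption)
    learning_rate=1e-5,                 # reported learning rate
    optim="adamw_torch",                # AdamW optimizer
    lr_scheduler_type="cosine",         # cosine learning rate schedule
    per_device_train_batch_size=1,      # batch size of 1
    gradient_accumulation_steps=4,      # gradient accumulation over 4 steps
    num_train_epochs=1,                 # single epoch
    warmup_steps=10,                    # 10 warmup steps
    bf16=True,                          # assumption: bfloat16 mixed precision
)
```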
Training Results
The training resulted in a final loss of approximately 1.1848. Various checkpoints throughout the training process showed gradual improvement in loss and gradient norm metrics.
Guide: Running Locally
- Setup Environment: Ensure you have Python installed along with the required libraries: transformers, torch, datasets, tokenizers, and peft.
- Download Model: Obtain the model from the Hugging Face repository via the model card link.
- Load Model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nicoboss/Hermes-3-Llama-3.1-405B-Samantha")
model = AutoModelForCausalLM.from_pretrained("nicoboss/Hermes-3-Llama-3.1-405B-Samantha")
```
- Run Inference: Use the model to generate or analyze text; a minimal generation sketch follows this list.
- Utilize Cloud GPUs: For efficient performance, consider utilizing cloud GPU resources such as AWS, GCP, or Azure.
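A minimal inference sketch, continuing from the loading step above. It assumes the repository ships a chat template usable via apply_chat_template; the prompt and sampling parameters are illustrative.

```python
# Build a chat-formatted prompt and generate a reply with the loaded model.
messages = [
    {"role": "user", "content": "Hi Samantha, how are you today?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```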
License
The Hermes-3-Llama-3.1-405B-Samantha model is distributed under the Llama 3.1 license. Users are responsible for ensuring compliance with the license terms and are advised to implement their own alignment layer due to the model's uncensored nature.