Hermes 3 Llama 3.1 405B Samantha
Introduction
The Hermes-3-Llama-3.1-405B-Samantha model is built on the Hermes-3-Llama-3.1-405B-Uncensored foundation and fine-tuned on the Samantha-newdataset-morelarge dataset. Training covers chain-of-thought (CoT) thinking tags and roleplay. The model is released under the Llama 3.1 license and is intended for uncensored use.
Architecture
The model uses an AutoTokenizer tokenizer and is configured to load in 4-bit precision, with a sequence length of 2048 and sample packing enabled. The architecture is augmented with QLoRA adapters, and training parameters were chosen to optimize performance; a configuration sketch follows below.
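The following is a minimal sketch of how such a 4-bit QLoRA setup might be expressed with transformers and peft. The base repository id and all adapter hyperparameters (r, alpha, dropout) are illustrative assumptions, not values taken from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model weights in 4-bit precision (quantization details are assumptions).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_repo = "nicoboss/Hermes-3-Llama-3.1-405B-Uncensored"  # assumed base repository id
tokenizer = AutoTokenizer.from_pretrained(base_repo)
model = AutoModelForCausalLM.from_pretrained(
    base_repo,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach QLoRA adapters on top of the quantized base; values here are illustrative only.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```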
Training
The model was trained on RunPod using 5 L40 GPUs, 160 vCPUs, and 1251 GiB of RAM. Training used a learning rate of 1e-05 with the AdamW optimizer and a cosine learning rate scheduler, a batch size of 1 with gradient accumulation over 4 steps, and ran for one epoch with 10 warmup steps.
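As an illustration, the reported hyperparameters map onto transformers TrainingArguments as sketched below; the actual run used its own training harness, and output_dir and the bf16 setting are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./hermes-3-samantha",   # placeholder path (assumption)
    learning_rate=1e-5,                 # reported learning rate
    optim="adamw_torch",                # AdamW optimizer
    lr_scheduler_type="cosine",         # cosine learning rate schedule
    per_device_train_batch_size=1,      # batch size of 1
    gradient_accumulation_steps=4,      # gradient accumulation over 4 steps
    num_train_epochs=1,                 # single epoch
    warmup_steps=10,                    # 10 warmup steps
    bf16=True,                          # assumption: bfloat16 mixed precision
)
```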
Training Results
The training resulted in a final loss of approximately 1.1848. Various checkpoints throughout the training process showed gradual improvement in loss and gradient norm metrics.
Guide: Running Locally
- Setup Environment: Ensure you have Python installed along with the required libraries: transformers, torch, datasets, tokenizers, and peft.
- Download Model: Obtain the model from the Hugging Face repository via the model card link.
- Load Model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("nicoboss/Hermes-3-Llama-3.1-405B-Samantha")
model = AutoModelForCausalLM.from_pretrained("nicoboss/Hermes-3-Llama-3.1-405B-Samantha")
```
- Run Inference: Use the model to generate or analyze text; a minimal generation sketch follows this list.
- Utilize Cloud GPUs: For efficient performance, consider utilizing cloud GPU resources such as AWS, GCP, or Azure.
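A minimal inference sketch, continuing from the loading step above. It assumes the repository ships a chat template usable via apply_chat_template; the prompt and sampling parameters are illustrative.

```python
# Build a chat-formatted prompt and generate a reply with the loaded model.
messages = [
    {"role": "user", "content": "Hi Samantha, how are you today?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```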
License
The Hermes-3-Llama-3.1-405B-Samantha model is distributed under the Llama 3.1 license. Users are responsible for ensuring compliance with the license terms and are advised to implement their own alignment layer due to the model's uncensored nature.