L3.3-Prikol-70B-v0.1a
Nohobby

Introduction
The L3.3-Prikol-70B-v0.1a model is a merge of several Llama 3.3-based models. It is designed for text generation tasks and is compatible with the transformers library. The model was built with the linear DELLA merge method, which folds several existing models into a single set of weights.
Architecture
The L3.3-Prikol-70B-v0.1a model is a composite of several foundational models:
- Base Model: TheDrummer/Anubis-70B-v1
- Merged Models:
  - EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0
  - Blackroot/Mirai-3.0-70B
  - Sao10K/L3.3-70B-Euryale-v2.3
The merge was executed with epsilon set to 0.04, lambda set to 1.05, and individual weight and density values for each merged model, with the data type configured as bfloat16.
Training
The model was created through a merging process rather than traditional training. The DELLA linear merge method was applied with TheDrummer/Anubis-70B-v1 as the base; each merged model received its own weight and density settings, which control how strongly, and how sparsely, its parameter differences from the base contribute to the final weights. A sketch of what such a merge recipe looks like follows.
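For reference, merges of this kind are usually expressed as a mergekit recipe. The sketch below is illustrative only: it assumes mergekit's della_linear method, and the per-model weight and density values are placeholders rather than the published recipe (only epsilon, lambda, and the dtype come from the description above):

import subprocess
import textwrap

# Hypothetical merge recipe; weight/density values are placeholders.
config = textwrap.dedent("""\
    merge_method: della_linear
    base_model: TheDrummer/Anubis-70B-v1
    models:
      - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0
        parameters: {weight: 0.33, density: 0.7}  # placeholder
      - model: Blackroot/Mirai-3.0-70B
        parameters: {weight: 0.33, density: 0.7}  # placeholder
      - model: Sao10K/L3.3-70B-Euryale-v2.3
        parameters: {weight: 0.33, density: 0.7}  # placeholder
    parameters:
      epsilon: 0.04
      lambda: 1.05
    dtype: bfloat16
""")

with open("prikol-merge.yml", "w") as f:
    f.write(config)

# mergekit-yaml is mergekit's command-line entry point.
subprocess.run(["mergekit-yaml", "prikol-merge.yml", "./merged-model"], check=True)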
Guide: Running Locally
- Setup Environment: Install the necessary libraries, primarily transformers, using pip: pip install transformers
- Download Model: Clone or download the model files from the Hugging Face repository; the snippet below shows one way to pre-fetch them.
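If you prefer to fetch the weights up front rather than on first load, the huggingface_hub client can do so. A small sketch (the download location defaults to the local Hugging Face cache):

from huggingface_hub import snapshot_download

# Downloads (or reuses) the repository files and returns their local path.
local_path = snapshot_download("Nohobby/L3.3-Prikol-70B-v0.1a")
print(local_path)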
- Load Model: Use the transformers library to load the model into your script:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Nohobby/L3.3-Prikol-70B-v0.1a")
model = AutoModelForCausalLM.from_pretrained("Nohobby/L3.3-Prikol-70B-v0.1a")
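On real hardware, the default full-precision, single-device load shown above is rarely practical for a 70B model. A common variant, sketched below assuming the accelerate package is installed, loads the weights in bfloat16 (matching the merge's dtype) and spreads them across available devices:

import torch
from transformers import AutoModelForCausalLM

# device_map="auto" (provided via accelerate) shards the weights across
# available GPUs and CPU RAM instead of loading onto a single device.
model = AutoModelForCausalLM.from_pretrained(
    "Nohobby/L3.3-Prikol-70B-v0.1a",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)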
- Inference: Use the model to generate text by providing prompts and processing the outputs, as in the example below.
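A minimal inference sketch follows. Since this is a Llama 3.3 derivative, prompts are typically formatted with the tokenizer's chat template; the sampling settings here are illustrative, not values from the model card:

# Assumes `tokenizer` and `model` were loaded as in the previous step.
messages = [{"role": "user", "content": "Write a short scene set in a lighthouse."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,  # illustrative sampling settings
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))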
- Hardware Recommendations: A 70B-parameter model is demanding; in bfloat16 the weights alone occupy roughly 140 GB, so cloud GPUs such as those available on AWS, GCP, or Azure are recommended for optimal performance.
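If that much GPU memory is not available, quantized loading is a common workaround. The sketch below assumes transformers' bitsandbytes integration (the bitsandbytes package and a CUDA GPU are required); the quantization settings are illustrative and not part of the model card:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization shrinks the weight footprint to roughly a
# quarter of the bfloat16 size, at some cost in output quality.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Nohobby/L3.3-Prikol-70B-v0.1a",
    quantization_config=quant_config,
    device_map="auto",
)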
License
Licensing details should be reviewed on the model's Hugging Face model card page. Ensure compliance with any usage terms or restrictions noted there.