gemma 2 9 B it advanced v2.1
jsgreenawaltIntroduction
GEMMA-2-9B-IT-ADVANCED-V2.1 is an advanced version of the Gemma-2-9B-it model, merged from several fine-tuned models using a state-of-the-art merging technique. It is designed to enhance text generation with improved instruction following, plot tracking, and creative writing capabilities.
Architecture
The model is a combination of multiple pre-existing models: wzhouad/gemma-2-9b-it-WPO-HB, princeton-nlp/gemma-2-9b-it-SimPO, and UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3. The merging process utilized the 'della' method with specific parameters tuned for optimal performance. The base model used for this integration was google/gemma-2-9b-it.
Training
The merging process involved setting specific parameters for each model, including density and weight, to balance their contributions effectively. The models were configured with normalization, int8 masking, and specific lambda and epsilon values. The output uses a float16 data type.
Guide: Running Locally
- Clone the Repository: Begin by cloning the model repository from Hugging Face.
- Install Dependencies: Ensure you have the
transformers
library installed. - Load the Model: Use the
transformers
library to load the GEMMA-2-9B-IT-ADVANCED-V2.1 model. - Run Inference: Implement text generation tasks by providing prompts to the model.
For performance optimization, it's recommended to use cloud GPUs available from providers like AWS, Google Cloud, or Azure.
License
The GEMMA-2-9B-IT-ADVANCED-V2.1 model is provided under a license specified by the contributors. Ensure to review and comply with the license terms when using the model.