codegemma-7B-ManimGen
Introduction
The codegemma-7B-ManimGen model, developed by thanhkt, is a fine-tuned text generation model based on the unsloth/codegemma-7b-it-bnb-4bit
model. It builds on the Hugging Face Transformers library and was optimized for faster training with the Unsloth library.
Architecture
The model architecture is built on top of Hugging Face's Transformers library, utilizing 4-bit precision for efficient computation. This setup enables the model to handle text generation tasks with lower resource consumption, maintaining performance while reducing computational overhead.
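A quick back-of-the-envelope calculation illustrates why 4-bit weights matter for a 7B-parameter model (a rough estimate of weight storage only, ignoring quantization block constants and activation memory):

```python
# Approximate weight-storage footprint of a 7-billion-parameter model.
params = 7_000_000_000

fp16_gb = params * 2 / 1e9    # 16-bit floats: 2 bytes per weight
int4_gb = params * 0.5 / 1e9  # 4-bit weights: half a byte per weight

print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {int4_gb:.1f} GB")  # fp16: 14.0 GB, 4-bit: 3.5 GB
```

At roughly 3.5 GB of weights instead of 14 GB, the quantized model fits comfortably on a single consumer GPU.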
Training
The codegemma-7B-ManimGen model was trained roughly twice as fast as a standard fine-tuning run by combining the Unsloth library with Hugging Face's TRL (Transformer Reinforcement Learning) library, making the fine-tuning process considerably more resource-efficient.
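A typical Unsloth + TRL fine-tuning setup looks roughly like the sketch below. This is a hedged illustration, not the author's actual training script: the dataset file, LoRA hyperparameters, sequence length, and text field name are all assumptions, TRL's `SFTTrainer` arguments vary between versions, and running it requires a CUDA GPU with the unsloth, trl, and datasets packages installed.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model through Unsloth's patched loader.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/codegemma-7b-it-bnb-4bit",
    max_seq_length=2048,  # assumed sequence length
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of extra weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # assumed LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)

# Hypothetical dataset of Manim prompt/code pairs.
dataset = load_dataset("json", data_files="manim_examples.json")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    dataset_text_field="text",  # assumed field name
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        num_train_epochs=1,
        output_dir="outputs",
    ),
)
trainer.train()
```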
Guide: Running Locally
- Clone the Repository: Clone the model repository from Hugging Face to your local environment.
- Install Dependencies: Ensure Python and essential libraries such as transformers, torch, and safetensors are installed.
- Load the Model: Use the Hugging Face Transformers library to load the model and tokenizer.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("thanhkt/codegemma-7B-ManimGen")
model = AutoModelForCausalLM.from_pretrained("thanhkt/codegemma-7B-ManimGen")
- Run Inference: Input a text prompt and generate text using the model.
inputs = tokenizer("Your text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)  # the default length limit is too short for full scripts
print(tokenizer.decode(outputs[0]))
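Since the model targets Manim code generation, prompts work best as explicit animation descriptions. A simple prompt builder might look like the following; the template is a hypothetical example, not an instruction format documented by the model card, so adapt it to whatever format the model was actually fine-tuned on:

```python
def build_prompt(description: str) -> str:
    """Wrap an animation description in a code-generation instruction.

    The template below is a hypothetical example, not the model's
    documented prompt format.
    """
    return (
        "Write Manim Community Edition code for the following animation.\n"
        f"Description: {description}\n"
        "Code:"
    )

prompt = build_prompt("a circle that morphs into a square")
print(prompt)
```

The resulting string can be passed straight to the tokenizer in the inference step above.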
Cloud GPUs
For optimal performance, consider running the model on a cloud-based GPU service such as AWS, Google Cloud Platform, or Azure. This setup can handle the computational demands of model inference efficiently.
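On a cloud GPU instance, the model can be loaded in 4-bit precision via bitsandbytes so that it fits on a single ~16 GB card. The sketch below assumes the bitsandbytes and accelerate packages are installed alongside transformers and torch, and requires a CUDA GPU:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantized load keeps the 7B model within a single ~16 GB GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("thanhkt/codegemma-7B-ManimGen")
model = AutoModelForCausalLM.from_pretrained(
    "thanhkt/codegemma-7B-ManimGen",
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)
```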
License
The codegemma-7B-ManimGen model is licensed under the Apache-2.0 License, which allows for both academic and commercial use, modification, and distribution.