GWQ-9B-Preview
Introduction
GWQ-9B-Preview is part of the Gemma with Questions family, developed using technology from Google's Gemini models. These text-to-text, decoder-only large language models are designed for tasks such as question answering, summarization, and reasoning. The GWQ models are English-language and ship open weights for both pre-trained and instruction-tuned variants, built on the Gemma2ForCausalLM architecture.
Architecture
- Transformer-Based Design: Utilizes self-attention mechanisms for processing input and capturing contextual relationships.
- Lightweight and Efficient: A comparatively small parameter count (9B) makes it suitable for resource-constrained environments.
- Modular Layers: Consists of stacked decoder layers adaptable for tasks like text generation and summarization.
- Attention Mechanisms: Multi-head self-attention enhances the handling of long-range dependencies (see the sketch after this list).
- Pre-training and Fine-Tuning: Pre-trained on large corpora and fine-tuned for specific tasks to improve performance.
- Scalability: Can scale according to application needs, balancing performance and resource use.
- Open-Source and Customizable: Allows modifications and extensions for specific use cases.
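To make the attention bullet concrete, here is a minimal sketch of causal multi-head self-attention in PyTorch. The dimensions and the absence of learned Q/K/V projections are simplifications for illustration, not GWQ-9B's actual configuration.

```python
import torch
import torch.nn.functional as F

def multi_head_self_attention(x, num_heads):
    # x: (batch, seq_len, d_model); d_model must be divisible by num_heads.
    batch, seq_len, d_model = x.shape
    head_dim = d_model // num_heads
    # For brevity, reuse x as queries, keys, and values (no learned projections).
    q = k = v = x.view(batch, seq_len, num_heads, head_dim).transpose(1, 2)
    # Scaled dot-product scores: (batch, heads, seq_len, seq_len).
    scores = q @ k.transpose(-2, -1) / head_dim**0.5
    # Causal mask: a decoder-only model attends only to earlier positions.
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    attn = F.softmax(scores, dim=-1)
    # Merge the heads back into the model dimension.
    return (attn @ v).transpose(1, 2).reshape(batch, seq_len, d_model)

x = torch.randn(1, 8, 64)
print(multi_head_self_attention(x, num_heads=4).shape)  # torch.Size([1, 8, 64])
```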
Training
GWQ is fine-tuned on the Chain of Continuous Thought Synthetic Dataset, which strengthens its reasoning and problem-solving capabilities. The model's architecture supports both pre-training on broad corpora and fine-tuning for domain-specific tasks.
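For intuition, the sketch below shows the basic shape of supervised fine-tuning for a causal language model: the labels are the input ids themselves, so the model learns to predict each next token. The tiny placeholder checkpoint and one-example "dataset" are stand-ins, not GWQ's actual training recipe.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Placeholder model small enough to run anywhere; swap in a real checkpoint.
tokenizer = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

texts = ["Question: 2 + 2? Reasoning: add the numbers. Answer: 4"]  # toy data
for text in texts:
    batch = tokenizer(text, return_tensors="pt")
    # For causal-LM fine-tuning, the labels are the input ids themselves.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```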
Guide: Running Locally
- Install Dependencies:
```bash
pip install -U transformers accelerate
```
- Load Model and Tokenizer:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/GWQ-9B-Preview")
model = AutoModelForCausalLM.from_pretrained(
    "prithivMLmods/GWQ-9B-Preview",
    device_map="auto",           # place layers on available device(s) automatically
    torch_dtype=torch.bfloat16,  # half precision to reduce memory use
)
```
- Generate Text:
```python
input_text = "Write me a poem about Machine Learning."
# Tokenize the prompt and move it to the GPU the model was placed on.
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```
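If the checkpoint ships a chat template (as Gemma-2 instruction-tuned models typically do, though that is an assumption for this preview), the instruction-tuned variant can also be prompted through it, reusing the tokenizer and model loaded above:

```python
messages = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
# Build the prompt from the checkpoint's chat template (assumed to exist).
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```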
- Cloud GPU Suggestion: For optimal performance, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure.
License
The model is licensed under the Gemma license.