neuralwork/gemma-2-9b-it-tr
Introduction
Gemma-2-9b-it-tr is a fine-tuned version of Google's gemma-2-9b-it model, specifically adapted for text generation and conversational tasks in Turkish. It was trained on a dataset of 55,000 question-answering and conversational samples.
Architecture
The model is based on the gemma-2-9b-it architecture and leverages the Transformers library for its implementation. It uses a text-generation pipeline and is optimized to improve conversational and reasoning skills.
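For a quick sanity check, the model can also be called through the standard Transformers text-generation pipeline. The sketch below is only an illustration of that usage pattern; the prompt and generation settings are assumptions, and chat-style inputs require a recent Transformers version.

import torch
from transformers import pipeline

# Minimal sketch: chat-style call through the text-generation pipeline.
# The question and max_new_tokens value are illustrative.
pipe = pipeline(
    "text-generation",
    model="neuralwork/gemma-2-9b-it-tr",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [{"role": "user", "content": "Türkiye'nin başkenti neresidir?"}]
result = pipe(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])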
Training
The training involved supervised fine-tuning using LoRA with rank=128 and lora_alpha=64, conducted over 4 days on a single RTX 6000 Ada GPU. The training data combines a filtered version of metedb/turkish_llm_datasets with a small private dataset of 8,000 conversational samples.
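As a rough sketch, a LoRA configuration with the reported rank and alpha could be expressed with the PEFT library as follows. The target modules, dropout, and everything else here are assumptions, not the actual training setup.

from peft import LoraConfig

# Sketch of a LoRA configuration using the reported hyperparameters.
# target_modules and lora_dropout are assumptions, not taken from the card.
lora_config = LoraConfig(
    r=128,
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
# The base gemma-2-9b-it model would then be wrapped with
# get_peft_model(model, lora_config) before supervised fine-tuning.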
Guide: Running Locally
To run the Gemma-2-9b-it-tr model locally, you will need Python with PyTorch and the Transformers library installed.
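If these are not already installed, a setup along the following lines should work (accelerate is assumed here because the loading snippet below uses device_map="auto"):

pip install torch transformers accelerate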
- Import Libraries:

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer
- Load the Model and Tokenizer:

  model = AutoModelForCausalLM.from_pretrained(
      "neuralwork/gemma-2-9b-it-tr",
      torch_dtype=torch.bfloat16,
      device_map="auto",
      trust_remote_code=True,
  )
  tokenizer = AutoTokenizer.from_pretrained("neuralwork/gemma-2-9b-it-tr")
- Prepare and Generate Text:

  messages = [
      {"role": "user", "content": "Python'da bir öğenin bir listede geçip geçmediğini nasıl kontrol edebilirim?"},
  ]
  prompt = tokenizer.apply_chat_template(
      messages, tokenize=False, add_generation_prompt=True
  )
  inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
  outputs = model.generate(
      **inputs,
      max_new_tokens=1024,
      do_sample=True,
      temperature=0.7,
      top_p=0.9,
  )
  # Decode only the newly generated tokens, skipping the prompt.
  response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
  print(response)
- Suggested Cloud GPUs: Consider using cloud services like AWS or Google Cloud with GPU support for better performance and faster inference times.
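If local GPU memory is limited, one option (not covered in the original guide) is to load the model in 4-bit precision. A minimal sketch, assuming the bitsandbytes package is installed:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical memory-saving variant: 4-bit quantized loading.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "neuralwork/gemma-2-9b-it-tr",
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("neuralwork/gemma-2-9b-it-tr")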
License
The model is licensed under the Gemma license. Please review the license terms as applicable.