gemma-2-2b-jpn-it (Google)

Introduction
Gemma-2-JPN is a text-to-text, decoder-only large language model from the Gemma series, fine-tuned on Japanese text. It is designed for a variety of text generation tasks such as question answering, summarization, and reasoning, and handles Japanese queries at the same level of performance that Gemma 2 delivers on English-only queries.
Architecture
Gemma models are open-weight, text-to-text models built from the same research and technology used to create Google's Gemini models. They are trained on the latest generation of Tensor Processing Unit (TPU) hardware, which offers advantages in performance, memory capacity, scalability, and cost-effectiveness, using JAX and ML Pathways for efficient large-scale training.
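To give a concrete flavor of JAX-style training, the sketch below shows a jitted gradient-descent step over placeholder parameters. It is a generic illustration only, not Gemma's actual training code; the model, loss function, and shapes are all hypothetical.

```python
# Generic JAX training step; illustrative only, not Gemma's training code.
import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Hypothetical linear model: predictions = x @ W + b.
    pred = x @ params["W"] + params["b"]
    return jnp.mean((pred - y) ** 2)

@jax.jit  # compile the whole update step with XLA (runs on TPU/GPU/CPU)
def train_step(params, x, y, lr=1e-3):
    grads = jax.grad(loss_fn)(params, x, y)
    # Plain SGD; real large-scale training would use an optimizer library.
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

params = {"W": jnp.zeros((4, 1)), "b": jnp.zeros((1,))}
x, y = jnp.ones((8, 4)), jnp.ones((8, 1))
params = train_step(params, x, y)
```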
Training
Gemma-2-JPN was trained on a diverse dataset of 8 trillion tokens, including web documents, code, mathematics, and multilingual instruction data, with rigorous filtering applied to remove harmful and sensitive content. The model was fine-tuned on TPUs and evaluated against GPT-3.5 on a comprehensive set of Japanese prompts.
Guide: Running Locally
Basic Steps
- Install Dependencies:

  ```
  pip install -U transformers
  pip install accelerate
  ```
- Run the Model:

  ```python
  from transformers import pipeline
  import torch

  pipe = pipeline(
      "text-generation",
      model="google/gemma-2-2b-jpn-it",
      model_kwargs={"torch_dtype": torch.bfloat16},
      device="cuda",  # use "mps" on Apple Silicon Macs
  )

  # Prompt: "Please write a poem about machine learning."
  messages = [{"role": "user", "content": "マシーンラーニングについての詩を書いてください。"}]
  outputs = pipe(messages, return_full_text=False, max_new_tokens=256)
  print(outputs[0]["generated_text"].strip())
  ```
- GPU Usage: Gemma-2-JPN runs on a single GPU or across multiple GPUs, in precisions such as bfloat16 or float32; see the loading sketch after this list.
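One way to realize the multi-GPU and precision options above is to load the model with device_map="auto" and an explicit dtype. This is a minimal sketch, assuming the accelerate package is installed and the gated weights are accessible (see License); the Japanese prompt asks "Where is the capital of Japan?".

```python
# Minimal sketch: shard the model across visible GPUs with an explicit dtype.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-jpn-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-jpn-it",
    device_map="auto",           # places layers across all visible GPUs
    torch_dtype=torch.bfloat16,  # or torch.float32 for full precision
)

# Prompt: "Where is the capital of Japan?"
inputs = tokenizer("日本の首都はどこですか？", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```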
Cloud GPUs
For enhanced performance, consider cloud accelerators such as TPUs on Google Cloud or GPU instances on AWS and Azure.
License
The Gemma-2-JPN model is released under Google's Gemma usage license, which must be reviewed and agreed to before the weights can be accessed. Ensure compliance with the Gemma Prohibited Use Policy and follow Google's guidelines for responsible use.
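In practice, downloading the gated weights from the Hugging Face Hub requires authenticating with an account that has accepted the license. A minimal sketch, assuming the huggingface_hub package and an access token created at hf.co/settings/tokens:

```python
# Authenticate before downloading the gated Gemma weights; the token
# must belong to an account that has accepted the license.
from huggingface_hub import login

login()  # prompts for an access token; or pass login(token="hf_...")
```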