gemma-2-2b-jpn-it (Google)

Introduction
Gemma-2-JPN is a text-to-text, decoder-only large language model from the Gemma series, fine-tuned on Japanese text. It is designed for a variety of text generation tasks such as question answering, summarization, and reasoning, and handles Japanese queries at the same level of performance that Gemma 2 delivers on English-only queries.
Architecture
Gemma models are open-weight, text-to-text models built from the same research and technology used to create Google's Gemini models. They are trained on the latest generation of Tensor Processing Unit (TPU) hardware, which offers advantages in performance, memory capacity, scalability, and cost-effectiveness, using JAX and ML Pathways for efficient large-scale training.
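To give a concrete flavor of JAX-style training, the sketch below shows a jitted gradient-descent step over placeholder parameters. It is a generic illustration only, not Gemma's actual training code; the model, loss function, and shapes are all hypothetical.

```python
# Generic JAX training step; illustrative only, not Gemma's training code.
import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Hypothetical linear model: predictions = x @ W + b.
    pred = x @ params["W"] + params["b"]
    return jnp.mean((pred - y) ** 2)

@jax.jit  # compile the whole update step with XLA (runs on TPU/GPU/CPU)
def train_step(params, x, y, lr=1e-3):
    grads = jax.grad(loss_fn)(params, x, y)
    # Plain SGD; real large-scale training would use an optimizer library.
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

params = {"W": jnp.zeros((4, 1)), "b": jnp.zeros((1,))}
x, y = jnp.ones((8, 4)), jnp.ones((8, 1))
params = train_step(params, x, y)
```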
Training
Gemma-2-JPN was trained on a diverse dataset of 8 trillion tokens, including web documents, code, mathematics, and multilingual instruction data, with rigorous filtering applied to remove harmful and sensitive content. The model was fine-tuned on TPUs and evaluated against GPT-3.5 on a comprehensive set of Japanese prompts.
Guide: Running Locally
Basic Steps
- Install Dependencies:

  ```
  pip install -U transformers
  pip install accelerate
  ```
- Run the Model:

  ```python
  from transformers import pipeline
  import torch

  pipe = pipeline(
      "text-generation",
      model="google/gemma-2-2b-jpn-it",
      model_kwargs={"torch_dtype": torch.bfloat16},
      device="cuda",  # use "mps" on Apple Silicon Macs
  )

  # Prompt: "Please write a poem about machine learning."
  messages = [{"role": "user", "content": "マシーンラーニングについての詩を書いてください。"}]
  outputs = pipe(messages, return_full_text=False, max_new_tokens=256)
  print(outputs[0]["generated_text"].strip())
  ```
- GPU Usage: Gemma-2-JPN runs on a single GPU or across multiple GPUs, in precisions such as bfloat16 or float32; see the loading sketch after this list.
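One way to realize the multi-GPU and precision options above is to load the model with device_map="auto" and an explicit dtype. This is a minimal sketch, assuming the accelerate package is installed and the gated weights are accessible (see License); the Japanese prompt asks "Where is the capital of Japan?".

```python
# Minimal sketch: shard the model across visible GPUs with an explicit dtype.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-jpn-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-jpn-it",
    device_map="auto",           # places layers across all visible GPUs
    torch_dtype=torch.bfloat16,  # or torch.float32 for full precision
)

# Prompt: "Where is the capital of Japan?"
inputs = tokenizer("日本の首都はどこですか？", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```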
Cloud GPUs
For enhanced performance, consider cloud accelerators such as TPUs on Google Cloud or GPU instances on AWS and Azure.
License
The Gemma-2-JPN model is released under Google's Gemma usage license, which must be reviewed and agreed to before the weights can be accessed. Ensure compliance with the Gemma Prohibited Use Policy and follow Google's guidelines for responsible use.
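In practice, downloading the gated weights from the Hugging Face Hub requires authenticating with an account that has accepted the license. A minimal sketch, assuming the huggingface_hub package and an access token created at hf.co/settings/tokens:

```python
# Authenticate before downloading the gated Gemma weights; the token
# must belong to an account that has accepted the license.
from huggingface_hub import login

login()  # prompts for an access token; or pass login(token="hf_...")
```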