Mistral Nemo Japanese Instruct 2408

cyberagent

Introduction

Mistral-Nemo-Japanese-Instruct-2408 is a Japanese language model developed by CyberAgent. It was produced by continual pre-training of Mistral-Nemo-Instruct-2407 and is designed for text generation tasks.

Architecture

The model uses the architecture of its base model, Mistral-Nemo-Instruct-2407, and is optimized for both Japanese and English text generation, including conversational AI applications.

Training

The model was continually pre-trained, starting from Mistral-Nemo-Instruct-2407, to handle Japanese language tasks effectively and to generate coherent, contextually relevant text.

Guide: Running Locally

  1. Installation

    • Ensure you have the latest version of the transformers library:
      pip install --upgrade transformers
      
  2. Load Model and Tokenizer

    • Use the following code to load the model and tokenizer:
      from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
      
      # device_map="auto" places the weights on the available devices;
      # torch_dtype="auto" uses the dtype stored in the checkpoint.
      model = AutoModelForCausalLM.from_pretrained("cyberagent/Mistral-Nemo-Japanese-Instruct-2408", device_map="auto", torch_dtype="auto")
      tokenizer = AutoTokenizer.from_pretrained("cyberagent/Mistral-Nemo-Japanese-Instruct-2408")
      # Print tokens as they are generated, without the prompt or special tokens.
      streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
      
  3. Run Inference

    • Prepare your input in the ChatML format and generate responses:
      messages = [
          {"role": "system", "content": "あなたは親切なAIアシスタントです。"},
          {"role": "user", "content": "AIによって私たちの暮らしはどのように変わりますか?"}
      ]
      
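      # Render the messages with the tokenizer's chat template and move the token IDs to the model's device.
      input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
      # do_sample=True is set so that temperature takes effect; under greedy decoding it is ignored.
      output_ids = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.5, streamer=streamer)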
      
  4. Hardware Recommendations

    • For good performance, run the model on a GPU. If you do not have one locally, consider GPU instances from a cloud provider such as AWS, Google Cloud, or Azure.
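
To judge how large a GPU is needed, a rough estimate of the memory taken by the weights alone can help. The sketch below is simple arithmetic, not a measurement: the parameter count (about 12 billion) is an assumption that does not come from this guide, `weight_memory_gib` is a hypothetical helper, and the figures exclude activations and the KV cache, which add to the total.

```python
# Rough weight-memory estimate. ASSUMED_PARAMS is an assumption (about 12B
# parameters); check the model's own documentation for the real figure.
ASSUMED_PARAMS = 12_000_000_000

# Bytes used to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(ASSUMED_PARAMS, dtype):.1f} GiB")
```

Actual usage during generation will be higher than these numbers, so treat them as a lower bound when choosing an instance.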

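Step 3 of the guide refers to the ChatML format. As a rough illustration of what that layout looks like, the sketch below renders a message list by hand. `render_chatml` is a hypothetical helper written for this example; it follows the generic ChatML convention, and the template shipped with the model's tokenizer may differ in details such as an added beginning-of-sequence token. For real inference, `tokenizer.apply_chat_template` remains the right call.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages in a generic ChatML layout (illustrative only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "あなたは親切なAIアシスタントです。"},
    {"role": "user", "content": "AIによって私たちの暮らしはどのように変わりますか?"},
]
print(render_chatml(messages))
```

Printing the rendered prompt this way is mainly useful for debugging, for example to confirm that the system message and the open assistant turn appear where expected.
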
License

The model is distributed under the Apache-2.0 License, allowing for both commercial and non-commercial use.
