Llama 3 Swallow 8B Instruct v0.1

tokyotech-llm

Introduction

The Llama-3-Swallow-8B-Instruct-v0.1 model is part of the Llama 3 Swallow family, developed with a focus on enhancing Japanese language capabilities. It is built with supervised fine-tuning (SFT) and Chat Vector, extending its proficiency in both English and Japanese text generation tasks.

Architecture

The model is based on the Llama 3 architecture and was trained with the Megatron-LM library. It supports text generation pipelines, is loadable with the transformers library, and is distributed in the safetensors format. For detailed architecture information, refer to the Llama 3 Model Card.
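As a rough sanity check on the architecture, the "8B" parameter count can be estimated from the published Llama 3 8B configuration. The hyperparameter values below (hidden size 4096, 32 layers, grouped-query attention with 8 KV heads, SwiGLU intermediate size 14336, vocabulary 128256) are taken from the Llama 3 Model Card and are assumptions here, not something stated in this card:

```python
# Estimate the parameter count of a Llama-3-8B-style transformer from its
# configuration. Values are the published Llama 3 8B hyperparameters
# (assumed, not read from the checkpoint).
VOCAB = 128_256
HIDDEN = 4_096
LAYERS = 32
HEADS = 32
KV_HEADS = 8           # grouped-query attention
INTERMEDIATE = 14_336  # SwiGLU MLP width

head_dim = HIDDEN // HEADS    # 128
kv_dim = KV_HEADS * head_dim  # 1024

# Attention: Q and O projections are HIDDEN x HIDDEN;
# K and V are HIDDEN x kv_dim thanks to GQA.
attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim
# SwiGLU MLP: gate, up, and down projections.
mlp = 3 * HIDDEN * INTERMEDIATE
per_layer = attn + mlp

# Input embeddings plus the (untied) output head.
embeddings = 2 * VOCAB * HIDDEN

total = LAYERS * per_layer + embeddings
print(f"~{total / 1e9:.2f}B parameters")
```

Norm weights and biases are omitted; they contribute only a few hundred thousand parameters, so the estimate still lands at roughly 8 billion.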

Training

The Swallow model underwent continual pre-training with a focus on Japanese language data. Instruction tuning utilized datasets like OpenAssistant Conversations. The model is in early development, with ongoing research to align outputs with human intent and safety.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the necessary packages:

    pip install vllm
    
  2. Load the model and tokenizer:

    from transformers import AutoTokenizer
    from vllm import LLM, SamplingParams
    
    model_name = "tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    llm = LLM(model=model_name, tensor_parallel_size=1)
    
  3. Set the sampling parameters and generate text:

    sampling_params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=512, stop="<|eot_id|>")
    message = [
        {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
        {"role": "user", "content": "東京の夜空に打ち上がっている花火の下、向かい合っている燕とラマの温かい物語を書いてください。"},
    ]
    prompt = tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=True)
    outputs = llm.generate(prompt, sampling_params)
    print(outputs[0].outputs[0].text)
    
  4. Consider using cloud GPUs: For good throughput with an 8B-parameter model, cloud GPUs such as those offered by AWS, Google Cloud, or Azure are recommended.
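In step 3, `apply_chat_template` expands the message list into Llama 3's special-token prompt format before it is passed to vLLM. The expansion can be sketched in plain Python; this is a simplified approximation for illustration, and the tokenizer's own template remains authoritative:

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Approximate the Llama 3 chat template: each turn is wrapped in
    role header tokens and terminated with <|eot_id|>."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
        prompt += m["content"] + "<|eot_id|>"
    if add_generation_prompt:
        # Leave the assistant header open so the model writes the reply.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(build_llama3_prompt(messages))
```

This also shows why `<|eot_id|>` is passed as the stop string in `SamplingParams`: the model emits it to close each assistant turn.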

License

The Llama-3-Swallow-8B-Instruct-v0.1 model is released under the Meta Llama 3 Community License. For more details, visit the license page.
