Llama3-8B-Chinese-Chat

shenzhi-wang

Introduction

Llama3-8B-Chinese-Chat is an instruction-tuned language model for Chinese and English users. Built on Meta-Llama-3-8B-Instruct, it aims to improve performance in roleplay, tool use, and mathematics. The model was fine-tuned with ORPO (Odds Ratio Preference Optimization) to reduce issues such as answering Chinese questions in English.
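ORPO adds an odds-ratio preference term to the usual supervised loss, rewarding the chosen response over the rejected one. A minimal sketch of that term, assuming `logp_chosen` and `logp_rejected` are average per-token log-probabilities of the two responses (all names are illustrative, not LLaMA-Factory's implementation):

```python
import math

def odds_ratio_loss(logp_chosen: float, logp_rejected: float) -> float:
    """Odds-ratio preference term used by ORPO (illustrative sketch).

    odds(y) = p / (1 - p), where p is the response probability under
    the model; the loss is -log(sigmoid(log-odds difference)).
    """
    def log_odds(logp: float) -> float:
        p = math.exp(logp)
        return math.log(p) - math.log(1.0 - p)

    # Encourage higher odds for the chosen response than the rejected one.
    ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-ratio)))  # -log(sigmoid(ratio))

# Preferring the chosen response lowers the loss.
low = odds_ratio_loss(math.log(0.6), math.log(0.3))
high = odds_ratio_loss(math.log(0.3), math.log(0.6))
```

In full ORPO training this term is added (with a weighting factor) to the standard next-token cross-entropy on the chosen response.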

Architecture

  • Base Model: Meta-Llama-3-8B-Instruct
  • Model Size: 8.03 billion parameters
  • Context Length: 8192 tokens
  • Languages Supported: English and Chinese

Training

Llama3-8B-Chinese-Chat was trained with the LLaMA-Factory framework using the following hyperparameters:

  • Epochs: 2
  • Learning Rate: 3e-6
  • Scheduler: Cosine
  • Warmup Ratio: 0.1
  • Batch Size: 128 (global)
  • Optimizer: Paged AdamW (32-bit)
  • Fine-tuning: Full parameters
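The cosine schedule with a 0.1 warmup ratio can be sketched as follows. The peak learning rate and warmup ratio come from the table above; the function itself is an illustrative reimplementation, not LLaMA-Factory's code, and the step counts are example values:

```python
import math

def lr_at_step(step: int, total_steps: int, peak_lr: float = 3e-6,
               warmup_ratio: float = 0.1) -> float:
    """Cosine learning-rate schedule with linear warmup (illustrative)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example: over 1000 optimizer steps, warmup ends at step 100 at the peak LR.
assert lr_at_step(100, 1000) == 3e-6
```

The paged 32-bit AdamW optimizer corresponds to `optim="paged_adamw_32bit"` in Hugging Face `TrainingArguments` (it requires the bitsandbytes library).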

Guide: Running Locally

  1. Install Dependencies: Ensure you have Python, PyTorch, and the transformers library installed (e.g. via pip).
  2. Download the Model: Use the transformers library to download the model:
    # Load the tokenizer and model; device_map="auto" places the weights
    # on an available GPU, and torch_dtype="auto" keeps the checkpoint's
    # native precision.
    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    model_id = "shenzhi-wang/Llama3-8B-Chinese-Chat"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    
  3. Run the Model: Apply the model's chat template to your messages with the tokenizer, then call generate to produce a response.
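A minimal inference sketch building on the loading code from step 2. `build_messages` and `chat` are helper names introduced here for illustration (not part of the transformers API), and the sampling parameters are example values:

```python
def build_messages(system: str, user: str) -> list:
    """Assemble a conversation in the role/content format that
    tokenizer.apply_chat_template expects (illustrative helper)."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

def chat(model, tokenizer, messages, max_new_tokens=256):
    """Apply the chat template, generate, and return only the reply text."""
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # example sampling settings; tune as needed
        temperature=0.6,
        top_p=0.9,
    )
    # Strip the prompt tokens so only the newly generated reply remains.
    return tokenizer.decode(
        outputs[0][input_ids.shape[-1]:], skip_special_tokens=True
    )

# Usage (with model and tokenizer loaded as in step 2):
# print(chat(model, tokenizer,
#            build_messages("You are a helpful assistant.", "你好！")))
```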

Cloud GPU Recommendation

For good inference performance, consider cloud services that offer high-end NVIDIA GPUs, such as AWS EC2, Google Cloud Platform, or Azure. An 8B-parameter model in 16-bit precision needs roughly 16 GB of GPU memory for the weights alone, so a card with at least 24 GB (e.g. an A10G or A100) is a comfortable choice.

License

Llama3-8B-Chinese-Chat is distributed under the Llama-3 License. For more details, refer to Meta's official license documentation.
