Llama3-8B-Chinese-Chat
shenzhi-wang/Llama3-8B-Chinese-Chat
Introduction
Llama3-8B-Chinese-Chat is an instruction-tuned language model for Chinese and English users. It is built on Meta-Llama-3-8B-Instruct and aims to improve performance in roleplay, tool use, and mathematics. The model was fine-tuned with ORPO (Odds Ratio Preference Optimization) to reduce issues such as answering Chinese questions in English.
Architecture
- Base Model: Meta-Llama-3-8B-Instruct
- Model Size: 8.03 billion parameters
- Context Length: 8192 tokens
- Languages Supported: English and Chinese
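The context length and vocabulary can be checked against the published config without downloading the weights; a minimal sketch, assuming the transformers library is installed:

```python
# Sketch: read the advertised limits from the model's published config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("shenzhi-wang/Llama3-8B-Chinese-Chat")
print(config.max_position_embeddings)  # context length, expected 8192
print(config.vocab_size)               # Llama-3 vocabulary, expected 128256
```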
Training
Llama3-8B-Chinese-Chat was trained with the LLaMA-Factory framework using the following hyperparameters (an illustrative training sketch follows the list):
- Epochs: 2
- Learning Rate: 3e-6
- Scheduler: Cosine
- Warmup Ratio: 0.1
- Batch Size: 128 (global)
- Optimizer: Paged AdamW (32-bit)
- Fine-tuning: Full parameters
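The model card names LLaMA-Factory as the training framework; the sketch below is not that setup but a roughly equivalent one expressed with the TRL library's ORPOTrainer, mirroring the hyperparameters above. The dataset name and the per-device/accumulation split that yields the 128 global batch size are assumptions for illustration.

```python
# Illustrative ORPO fine-tuning sketch using TRL, NOT the original
# LLaMA-Factory run. Hyperparameters mirror the list above; the dataset
# and the 8 x 16 = 128 global batch split are assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# Hypothetical preference dataset with "prompt"/"chosen"/"rejected" columns.
dataset = load_dataset("your-org/your-preference-dataset", split="train")

config = ORPOConfig(
    output_dir="llama3-8b-chinese-chat-orpo",
    num_train_epochs=2,              # Epochs: 2
    learning_rate=3e-6,              # Learning rate: 3e-6
    lr_scheduler_type="cosine",      # Scheduler: cosine
    warmup_ratio=0.1,                # Warmup ratio: 0.1
    per_device_train_batch_size=8,   # assumed split: 8 x 16 accumulation
    gradient_accumulation_steps=16,  #   = 128 global batch on one GPU
    optim="paged_adamw_32bit",       # Optimizer: paged AdamW (32-bit)
    bf16=True,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```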
Guide: Running Locally
- Install Dependencies: Ensure you have Python, the transformers library, and accelerate (required for device_map="auto") installed.
- Download the Model: Use the transformers library to download and load the model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "shenzhi-wang/Llama3-8B-Chinese-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
```
- Run the Model: Use the tokenizer and model to generate text from input prompts, as sketched below.
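A minimal generation sketch, assuming the model and tokenizer loaded above and that the tokenizer ships the Llama-3 chat template; the prompt and sampling settings are illustrative:

```python
# Build a chat-formatted prompt and generate a reply.
messages = [
    {"role": "user", "content": "你好，请介绍一下你自己。"},  # "Hello, please introduce yourself."
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,   # illustrative cap on reply length
    do_sample=True,
    temperature=0.6,      # assumed sampling settings, tune as needed
    top_p=0.9,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```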
Cloud GPU Recommendation
In bfloat16, the 8B weights alone occupy roughly 16 GB of GPU memory, so consider cloud GPUs with 24 GB or more (for example, NVIDIA A10G, L4, or A100 instances on AWS EC2, Google Cloud Platform, or Azure).
License
Llama3-8B-Chinese-Chat is distributed under the Meta Llama 3 Community License. For details, refer to the official license text at https://llama.meta.com/llama3/license/.