Llama 3.2 Taiwan 3B Instruct
Introduction
Llama-3.2-Taiwan-3B-Instruct is a language model designed for text generation and conversational tasks. Developed by Liang Hsun, it is fine-tuned with Traditional Chinese and multilingual dialogue datasets to enhance its knowledge and style relevant to Taiwan.
Architecture
The model is based on the LlamaForCausalLM architecture and is fine-tuned from lianghsun/Llama-3.2-Taiwan-3B. It supports multiple languages, including Chinese, English, Italian, German, French, Japanese, and Korean, and is tuned for conversational tasks using direct preference optimization.
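Since the model targets conversational use, prompts follow the Llama 3.x chat format. The sketch below builds such a prompt by hand for illustration; the special tokens shown are the standard Llama 3 ones, but in practice you should rely on the tokenizer's own chat template (e.g. `tokenizer.apply_chat_template` in transformers) rather than this hand-rolled helper, and the example system/user strings are hypothetical.

```python
# Minimal sketch of the Llama 3.x chat prompt format (assumed for this model
# family); prefer tokenizer.apply_chat_template in real use.
def format_llama3_chat(system: str, user: str) -> str:
    """Assemble a single-turn chat prompt with Llama 3 special tokens."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        # The trailing assistant header cues the model to generate its reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_chat("你是一個台灣的AI助理。", "台灣最高的山是哪座?")
```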
Training
The model was trained on a variety of datasets covering legal, medical, and general knowledge domains, using a framework of multi-round instruction fine-tuning followed by direct preference optimization. Training ran on a distributed multi-GPU setup with a learning rate of 5e-05 and a batch size of 105, across 4 GPUs.
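Direct preference optimization trains the policy to prefer the chosen response over the rejected one in each preference pair, relative to a frozen reference model. A minimal sketch of the per-pair DPO loss (the function and its inputs are illustrative, not the training code used for this model):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen/rejected responses
    under the policy (pi_*) and the frozen reference model (ref_*).
    beta scales how strongly the policy is pushed away from the reference.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response than the reference does, versus the rejected response.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)), computed in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

With no preference signal (all log-probs equal) the loss is log 2; as the policy increases the chosen response's likelihood relative to the reference, the loss falls toward zero.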
Guide: Running Locally
To run the model locally using Docker, follow these steps:
- Ensure you have Docker installed and configured with NVIDIA GPU support.
- Use the following command to run the model:
```shell
docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=<secret>" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --model lianghsun/Llama-3.2-Taiwan-3B-Instruct
```
- If using a different checkpoint, append `--revision <tag_name>` to the command.
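The vLLM container above exposes an OpenAI-compatible API on port 8000. A minimal client sketch (the prompt is a made-up example; the model name and endpoint match the docker command above):

```python
import json
from urllib import request

# Request body for the OpenAI-compatible chat completions endpoint.
payload = {
    "model": "lianghsun/Llama-3.2-Taiwan-3B-Instruct",
    "messages": [
        {"role": "user", "content": "請用繁體中文介紹台北101。"}
    ],
    "max_tokens": 256,
}

req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the container is running:
# with request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```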
For optimal performance, using cloud GPUs such as NVIDIA H100 NVL is recommended.
License
The model is released under the llama3.2 license. Users should refer to the license file for specific terms and conditions.