Neural-Chat-7B-V3-3
Intel
Introduction
Neural-Chat-7B-V3-3 is a fine-tuned large language model (LLM) developed by Intel for a variety of language-related tasks. It is based on a 7-billion-parameter architecture and was trained on Intel's Gaudi 2 processor. The model is aligned using the Direct Preference Optimization (DPO) method.
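DPO aligns a model by pushing the policy to assign higher relative likelihood to a preferred response than to a rejected one, measured against a frozen reference model. A minimal sketch of the per-pair loss (the function name and inputs here are illustrative, not Intel's training code):

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are log-probabilities of the chosen and rejected responses
    under the policy (pi_*) and the frozen reference model (ref_*).
    beta scales how strongly the policy is penalized for drifting
    from the reference preferences.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response than the reference does, vs. the rejected one.
    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(logits)), computed stably.
    return math.log1p(math.exp(-logits))
```

When the policy matches the reference exactly, the margin is zero and the loss is log 2; widening the margin in favor of the chosen response drives the loss toward zero.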
Architecture
Neural-Chat-7B-V3-3 is derived from the Intel/neural-chat-7b-v3-1 model, which was itself fine-tuned from the Mistral-7B-v0.1 architecture. The model supports a context length of 8192 tokens. It was trained and aligned using datasets such as MetaMathQA and Intel's own DPO preference pairs, making it suitable for tasks like text generation and mathematical problem solving.
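Chat models in this family expect prompts in a specific layout. The neural-chat series documents a "### System / ### User / ### Assistant" template; the helper below assumes that layout (verify against the model card on Hugging Face before relying on it):

```python
def build_prompt(user_message, system_message="You are a helpful assistant."):
    """Format a single-turn prompt in the neural-chat template.

    Layout assumed from the neural-chat model cards; the trailing
    '### Assistant:' cues the model to begin its reply.
    """
    return (
        f"### System:\n{system_message}\n"
        f"### User:\n{user_message}\n"
        f"### Assistant:\n"
    )
```

The formatted string can be passed to any standard causal-LM generation call; the model's completion follows the final `### Assistant:` marker.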
Training
The model was trained on the MetaMathQA dataset with additional augmentation from GSM8k and MATH datasets. Training was conducted on Intel's Gaudi 2 processor, utilizing 8 cards. The model achieved competitive results on benchmarks like ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K.
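Scoring on GSM8K typically means comparing a model's final numeric answer against the reference, whose solutions end in a line like `#### 42`. A hedged sketch of such an extraction step (a common evaluation heuristic, not Intel's published harness):

```python
import re

def extract_gsm8k_answer(completion):
    """Pull the final numeric answer from a GSM8K-style solution.

    Reference solutions end in '#### <number>'; if that marker is
    absent, fall back to the last number in the text, a common
    heuristic for free-form model output.
    """
    match = re.search(r"####\s*(-?[\d,]+(?:\.\d+)?)", completion)
    if match:
        return float(match.group(1).replace(",", ""))
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return float(numbers[-1]) if numbers else None
```

An exact match between the extracted value and the reference answer then counts as a correct solution.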
Guide: Running Locally
Setup:
- Clone the repository:
git clone https://github.com/intel/intel-extension-for-transformers.git
cd intel-extension-for-transformers
- Build the Docker container:
docker build --no-cache ./ --target hpu --build-arg REPO=https://github.com/intel/intel-extension-for-transformers.git --build-arg ITREX_VER=main -f ./intel_extension_for_transformers/neural_chat/docker/Dockerfile -t chatbot_finetuning:latest
- Run the Docker container:
docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host chatbot_finetuning:latest
Training:
- Inside the Docker container, navigate to:
cd examples/finetuning/finetune_neuralchat_v3
- Execute the training script with DeepSpeed:
deepspeed --include localhost:0,1,2,3,4,5,6,7 --master_port 29501 finetune_neuralchat_v3.py
Cloud GPUs:
- If Gaudi hardware is unavailable, consider cloud-based GPU services, such as AWS EC2 with NVIDIA GPUs or Google Cloud's AI Platform, for training and inference.
License
The Neural-Chat-7B-V3-3 model is released under the Apache 2.0 license, which allows for both personal and commercial use with proper attribution. Users are advised to consult legal guidance for commercial applications.