Neural-Chat-7B-V3-3

Intel

Introduction

Neural-Chat-7B-V3-3 is a fine-tuned large language model (LLM) developed by Intel for a variety of language-related tasks. It is based on a 7 billion parameter architecture and was trained on Intel's Gaudi 2 processors. The model is aligned with human preferences using the Direct Preference Optimization (DPO) method.

Architecture

Neural-Chat-7B-V3-3 originates from the Intel/neural-chat-7b-v3-1 model, which was fine-tuned from the Mistral-7B-v0.1 architecture. The model supports a context length of 8192 tokens. It has been trained and optimized using datasets such as MetaMathQA and Intel's own DPO pairs, making it suitable for tasks like text generation and mathematical problem solving.
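For chat-style text generation, a prompt should be assembled in the template the model was fine-tuned with. The helper below sketches the "### System / ### User / ### Assistant" layout used by the neural-chat v3 series; the default system message here is an illustrative placeholder, not taken from the model card.

```python
def build_neural_chat_prompt(user_message: str,
                             system_message: str = "You are a helpful assistant.") -> str:
    """Assemble a prompt in the neural-chat v3 chat layout.

    The '### System/User/Assistant' structure follows the neural-chat v3
    prompt format; the default system message is a placeholder assumption.
    """
    return (
        f"### System:\n{system_message}\n"
        f"### User:\n{user_message}\n"
        f"### Assistant:\n"
    )

prompt = build_neural_chat_prompt("Solve: 12 * 9 - 4")
print(prompt)
```

The resulting string can be passed to any text-generation backend (e.g. a Hugging Face `text-generation` pipeline loading `Intel/neural-chat-7b-v3-3`); generation should stop when the model begins a new "###" turn marker.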

Training

The model was trained on the MetaMathQA dataset, which augments problems from the GSM8K and MATH datasets. Training was conducted on Intel's Gaudi 2 processor, using 8 cards. The model achieved competitive results on benchmarks such as ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K.
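DPO aligns the model directly from preference pairs (a chosen and a rejected response per prompt) without training a separate reward model: it increases the policy's log-probability of the chosen response relative to the rejected one, measured against a frozen reference model. A minimal sketch of the per-pair loss, with illustrative variable names and scalar log-probabilities standing in for summed token log-probs:

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair Direct Preference Optimization loss.

    loss = -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))
    where each argument is the log-probability of a full response.
    """
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# If the policy favors the chosen response more than the reference does,
# the margin is positive and the loss falls below log(2) (the zero-margin value).
print(dpo_loss(-10.0, -20.0, -12.0, -15.0))
```

Minimizing this loss over a dataset of preference pairs is what the fine-tuning step with Intel's DPO pairs performs, at batch scale and with gradients flowing only through the policy terms.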

Guide: Running Locally

  1. Setup:

    • Clone the repository:
      git clone https://github.com/intel/intel-extension-for-transformers.git
      cd intel-extension-for-transformers
      
    • Build the Docker container:
      docker build --no-cache ./ --target hpu --build-arg REPO=https://github.com/intel/intel-extension-for-transformers.git --build-arg ITREX_VER=main -f ./intel_extension_for_transformers/neural_chat/docker/Dockerfile -t chatbot_finetuning:latest
      
    • Run the Docker container:
      docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host chatbot_finetuning:latest
      
  2. Training:

    • Inside the Docker container, navigate to:
      cd examples/finetuning/finetune_neuralchat_v3
      
    • Execute the training script with DeepSpeed:
      deepspeed --include localhost:0,1,2,3,4,5,6,7 --master_port 29501 finetune_neuralchat_v3.py
      
  3. Cloud GPUs:

    • Consider using cloud-based GPU services, such as AWS EC2 with NVIDIA GPUs or Google Cloud's AI Platform, for efficient training and inference.

License

The Neural-Chat-7B-V3-3 model is released under the Apache 2.0 license, which allows for both personal and commercial use with proper attribution. Users are advised to consult legal guidance for commercial applications.
