watt tool 8 B

watt-ai

Introduction

WATT-TOOL-8B is a fine-tuned language model based on LLaMa-3.1-8B-Instruct, designed to excel in tool usage and multi-turn dialogue. It achieves state-of-the-art performance on the Berkeley Function-Calling Leaderboard (BFCL).

Architecture

The model is optimized for scenarios requiring complex tool usage and multi-turn interactions. It is particularly effective in applications like AI workflow building, demonstrated through platforms such as Lupan and Coze. Key features include enhanced tool usage, multi-turn dialogue capabilities, and state-of-the-art performance in function calling and tool usage.

Training

WATT-TOOL-8B is trained using supervised fine-tuning on a specialized dataset tailored for tool usage and multi-turn dialogue. The training utilizes Chain-of-Thought (CoT) techniques and follows principles from the paper "Direct Multi-Turn Preference Optimization for Language Agents." Techniques such as Supervised Fine-Tuning (SFT) and Direct Multi-Turn Preference Optimization (DMPO) are employed to enhance its performance.

Guide: Running Locally

To use WATT-TOOL-8B locally, follow these steps:

  1. Install the required libraries:

    pip install transformers torch
    
  2. Load the model and tokenizer:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_id = "watt-ai/watt-tool-8B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype='auto', device_map="auto")
    
  3. Prepare the input:

    • Define user queries and tool functions in JSON format.
    • Use the tokenizer to convert the input into tensors.
  4. Generate responses:

    outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
    

For optimal performance, it is suggested to use cloud GPUs, such as AWS EC2 with NVIDIA GPUs, Google Cloud's AI Platform, or Azure's GPU offerings.

License

WATT-TOOL-8B is licensed under the Apache-2.0 License.

More Related APIs