watt-tool-70B

watt-ai

Introduction

The watt-tool-70B model is a fine-tuned language model based on LLaMa-3.3-70B-Instruct. It is optimized for tool usage and multi-turn dialogue, achieving state-of-the-art performance on the Berkeley Function-Calling Leaderboard (BFCL).

Architecture

The watt-tool-70B model is designed for complex tool usage scenarios requiring multi-turn interactions, making it suitable for AI-powered workflow platforms like Lupan. It excels in understanding user requests, selecting appropriate tools, and utilizing them effectively across multiple conversation turns.

Key Features

  • Enhanced Tool Usage: Fine-tuned for precise and efficient tool selection and execution.
  • Multi-Turn Dialogue: Maintains context and uses tools effectively across multiple conversation turns.
  • State-of-the-Art Performance: Excels in function calling and tool usage on the BFCL.
  • Based on LLaMa-3.3-70B-Instruct: Leverages strong language understanding and generation capabilities.

Training

The model is trained using supervised fine-tuning on a specialized dataset for tool usage and multi-turn dialogue. The training incorporates Chain-of-Thought (CoT) techniques to generate high-quality data. The process is guided by principles from the paper "Direct Multi-Turn Preference Optimization for Language Agents" and employs SFT and DMPO methods to enhance performance in multi-turn agent tasks.

Guide: Running Locally

To run the watt-tool-70B model locally, follow these steps:

  1. Install Dependencies: Install the transformers library and a recent PyTorch build, e.g. pip install transformers torch.
  2. Load Model and Tokenizer:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_id = "watt-ai/watt-tool-70B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # device_map="auto" spreads the weights across available GPUs;
    # torch_dtype='auto' uses the dtype stored in the checkpoint.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype='auto', device_map="auto")
    
  3. Prepare Input: Format your input using the provided system prompt template.
  4. Generate Output: Use the model to generate responses to your queries.
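As a minimal sketch of steps 3 and 4, assuming a hypothetical get_weather function schema: the exact system prompt template ships with the model card, so the prompt below is an illustrative placeholder, and the generation calls are shown as comments since they require the 70B weights loaded in step 2.

```python
import json

# Hypothetical tool schema in a JSON-schema style (illustrative, not from the model card).
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}]

# Placeholder system prompt; substitute the template provided with the model.
system_prompt = (
    "You are a helpful assistant with access to the following functions. "
    "Use them if required:\n" + json.dumps(tools, indent=2)
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What's the weather like in Paris?"},
]

# With model and tokenizer loaded as in step 2:
# inputs = tokenizer.apply_chat_template(
#     messages, add_generation_prompt=True, return_tensors="pt"
# ).to(model.device)
# outputs = model.generate(inputs, max_new_tokens=256)
# print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The model is expected to reply with a function call naming the tool and its arguments, which your application then executes before returning the result in a follow-up turn.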

Note that a 70B-parameter model requires substantial GPU memory (on the order of 140 GB in 16-bit precision); consider cloud GPUs or a quantized variant if local hardware is limited.

License

The watt-tool-70B model is licensed under the Apache-2.0 License.
