# watt-tool-70B

## Introduction

The watt-tool-70B model is a fine-tuned language model based on LLaMa-3.3-70B-Instruct. It is optimized for tool usage and multi-turn dialogue, achieving state-of-the-art performance on the Berkeley Function-Calling Leaderboard (BFCL).
## Architecture

The watt-tool-70B model is designed for complex tool-usage scenarios that require multi-turn interactions, making it suitable for AI-powered workflow platforms such as Lupan. It excels at understanding user requests, selecting the appropriate tools, and using them effectively across multiple conversation turns.
## Key Features
- Enhanced Tool Usage: Fine-tuned for precise and efficient tool selection and execution.
- Multi-Turn Dialogue: Maintains context and uses tools effectively across multiple conversation turns.
- State-of-the-Art Performance: Excels in function calling and tool usage on the BFCL.
- Based on LLaMa-3.3-70B-Instruct: Leverages the base model's strong language understanding and generation capabilities.
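Models fine-tuned for BFCL-style function calling typically emit tool calls as a bracketed list of Python-like invocations, e.g. `[get_weather(city="Berlin")]`. Assuming that output shape (the exact format is defined by the model card's prompt template, not confirmed here), a minimal parser sketch:

```python
import ast

def parse_tool_calls(text: str):
    """Parse a BFCL-style tool-call string such as
    '[get_weather(city="Berlin"), get_time(zone="CET")]'
    into a list of (function_name, kwargs) pairs.
    Raises ValueError on malformed input."""
    tree = ast.parse(text.strip(), mode="eval")
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a bracketed list of calls")
    calls = []
    for node in tree.body.elts:
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
            raise ValueError("expected name(kwarg=value) calls")
        # ast.literal_eval safely evaluates constant arguments only.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((node.func.id, kwargs))
    return calls
```

Parsing through `ast` rather than `eval` keeps untrusted model output from executing arbitrary code.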
## Training
The model is trained with supervised fine-tuning (SFT) on a specialized dataset for tool usage and multi-turn dialogue, using Chain-of-Thought (CoT) techniques to generate high-quality training data. The process follows principles from the paper "Direct Multi-Turn Preference Optimization for Language Agents", combining SFT with DMPO to enhance performance in multi-turn agent tasks.
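For background, DMPO builds on the single-turn Direct Preference Optimization (DPO) objective shown below; the multi-turn variant in the cited paper modifies it for trajectory-level preferences, so this is reference material rather than the exact training loss:

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
  \left[
    \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)
  \right]
```

Here $y_w$ and $y_l$ are the preferred and rejected responses, $\pi_{\mathrm{ref}}$ is the frozen reference policy, and $\beta$ controls the strength of the preference signal.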
## Guide: Running Locally
To run the watt-tool-70B model locally, follow these steps:

- Install Dependencies: Ensure the `transformers` library is installed.
- Load Model and Tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "watt-ai/watt-tool-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
```

- Prepare Input: Format your input using the provided system prompt template.
- Generate Output: Use the model to generate responses to your queries.
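The last two steps can be sketched as follows. The tool schema and system prompt wording here are illustrative assumptions (the actual template ships with the model card), and the generation settings are arbitrary:

```python
import json

# Hypothetical tool schema -- any JSON-schema-style function description
# can be injected into the system prompt this way.
tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

# Assumed system prompt wording; the real template is provided in the model card.
system_prompt = (
    "You are an expert in composing functions. "
    "Here is a list of functions you can invoke:\n"
    + json.dumps(tools, indent=2)
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What's the weather in Berlin?"},
]

if __name__ == "__main__":
    # Model loading and generation require a GPU with enough memory for 70B weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "watt-ai/watt-tool-70B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```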
Note that a 70B-parameter model requires substantial GPU memory; consider using cloud GPUs for improved performance, especially when dealing with large workloads or complex queries.
## License

The watt-tool-70B model is licensed under the Apache-2.0 License.