Arch-Function-3B
Introduction
The Katanemo Arch-Function collection consists of state-of-the-art large language models (LLMs) designed for function calling tasks. These models excel at understanding complex function signatures, identifying required parameters, and producing accurate function call outputs based on natural language prompts. The collection achieves performance on par with GPT-4, making it suitable for scenarios involving automated API interaction and function execution.
Architecture
The core LLM in this collection is integrated into the open-source Arch Gateway. Key capabilities include single, parallel, and multiple function calling, which generalize across use cases ranging from API interaction to backend task automation.
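To make these terms concrete, the sketch below shows simplified, illustrative representations of the three calling modes. The tool names, arguments, and output shape are assumptions for illustration only; the model's actual output follows its chat template.

import json

# Illustrative only: the dict shape and tool names below are assumed.

# Single: one function is available and the model emits one call.
single = [{"name": "get_weather", "arguments": {"location": "Seattle"}}]

# Parallel: the model emits several calls in a single turn.
parallel = [
    {"name": "get_weather", "arguments": {"location": "Seattle"}},
    {"name": "get_weather", "arguments": {"location": "Boston"}},
]

# Multiple: several functions are available and the model selects the right one.
multiple = [{"name": "get_time", "arguments": {"timezone": "US/Pacific"}}]

print(json.dumps({"single": single, "parallel": parallel, "multiple": multiple}, indent=2))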
Training
The models are built on top of the Qwen 2.5 architecture. Performance benchmarks have been evaluated on the Berkeley Function-Calling Leaderboard, demonstrating competitive results in function calling tasks.
Guide: Running Locally
- Install Requirements: Ensure you have the Hugging Face transformers library, version 4.37.0 or later:

pip install "transformers>=4.37.0"
- Model Loading: Use the following Python code to load the model and tokenizer:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "katanemo/Arch-Function-3B"

# Load the weights with automatic device placement and dtype selection
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
- Execute Example: Follow the single- and multi-turn examples to perform function calling with the model; a minimal single-turn sketch is shown below.
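As a minimal sketch of a single-turn call, the snippet below reuses the model and tokenizer loaded in the previous step. The get_weather tool schema and the system-prompt wording are illustrative assumptions; for production use, prefer the exact prompt template from the model card.

import json

# Hypothetical tool schema, for illustration only.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"],
        },
    }
]

# Expose the tool definitions to the model through a system prompt (assumed wording).
messages = [
    {
        "role": "system",
        "content": "You have access to the following tools:\n" + json.dumps(tools, indent=2),
    },
    {"role": "user", "content": "What is the weather like in Seattle?"},
]

# Render the chat template, generate, and decode only the newly generated tokens.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))

The decoded text should contain a structured call such as get_weather(location="Seattle"), which your application can then parse and execute.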
- Cloud GPUs: For optimal performance, consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure.
License
The Katanemo Arch-Function collection is distributed under the Katanemo license. For more details, refer to the license document.