Syntax Error-Free and Generalizable Tool Use for LLMs: Related Work

2 Jun 2024


(1) Kexun Zhang, UC Santa Barbara (equal contribution);

(2) Hongqiao Chen, Northwood High School (equal contribution);

(3) Lei Li, Carnegie Mellon University;

(4) William Yang Wang, UC Santa Barbara.

Fine-tuning language models to use tools. Language models can be fine-tuned to use tools with data that interleave text and tool calls. Earlier studies fine-tune language models to use a single tool, such as a retrieval module (Borgeaud et al., 2022; Guu et al., 2020) or a search engine (Nakano et al., 2021). More recent tool-augmented language models that use multiple tools (Schick et al., 2023; Parisi et al., 2022) are also fine-tuned, on tools that include QA models, translation models, calculators, and search engines. ToolkenGPT (Hao et al., 2023) represents tools with special tokens and tunes only the embeddings of those tokens, which makes adopting new tools more efficient. However, fine-tuning approaches still need new data and extra fine-tuning to adapt a model to new tools. Table 1 lists the differences between finite-state decoding and the two prior paradigms, fine-tuning and in-context learning.

In-context learning for tool use. Language models can learn from in-context examples (Brown et al., 2020) and follow instructions (Ouyang et al., 2022). This makes it possible to simply put the descriptions of tools in the prompt and ask the language model to use them. Recent work has exploited this ability to let models use neural models (Shen et al., 2023), RESTful APIs (Qin et al., 2023; Song et al., 2023), program interpreters (Chen et al., 2022; Gao et al., 2023), and many other tools to solve problems. In-context learning needs no extra model tuning to use new tools. However, the descriptions and documentation of new tools still have to be in the prompt, which increases computation cost and leaves less of the context budget for the model to actually reason about the task.
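To make the context-budget cost concrete, the following is a minimal sketch of in-context tool use, not any cited system's actual prompt format. The tool names, the descriptions, the `build_prompt` helper, and the whitespace-based token count are all assumptions made for illustration.

```python
# Hypothetical tool registry: names and descriptions are made up for illustration.
TOOLS = {
    "calculator": "calculator(expr): evaluate an arithmetic expression.",
    "search": "search(query): return the top web results for a query.",
    "translate": "translate(text, lang): translate text into the target language.",
}


def build_prompt(task, tools):
    """Prepend every tool description to the task, as in-context tool use requires."""
    lines = ["You can call the following tools:"]
    lines += [f"- {desc}" for desc in tools.values()]
    lines += ["", f"Task: {task}"]
    return "\n".join(lines)


def rough_token_count(text):
    """Whitespace split as a crude proxy for a real tokenizer (an assumption)."""
    return len(text.split())


task = "What is 17 * 23?"
one_tool = build_prompt(task, {"calculator": TOOLS["calculator"]})
all_tools = build_prompt(task, TOOLS)

# The task is identical, but the prompt grows with every tool that is described.
print(rough_token_count(one_tool), rough_token_count(all_tools))
```

Every added tool lengthens the prompt even when the task does not need it, which is the cost described in the paragraph above.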

Constrained decoding and finite-state machines. Previous constrained decoding methods mainly focus on lexical constraints (Anderson et al., 2017). They reduce the large search space of lexically constrained decoding by using finite-state machines (Anderson et al., 2017), by grouping together similar candidates (Hokamp & Liu, 2017), and by using better search algorithms (Miao et al., 2019; Lu et al., 2021; 2022). However, lexical constraints are not expressive enough to regulate tool calls. Finite-state machines have to be weighted and probabilistic to handle the soft constraints of natural language (Eisner, 2002; Rastogi et al., 2016), whereas the syntactic constraints on tool calls are hard constraints, which are much easier for FSMs to enforce. Therefore, we propose TOOLDEC to meet the syntactic constraints of a valid tool call.
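The idea of enforcing hard syntactic constraints with an FSM can be illustrated with a toy sketch. This is not the TOOLDEC implementation: the grammar, the token set, the `constrained_decode` helper, and the `toy_scores` stand-in for a language model are all hypothetical, and a real system would mask logits over a subword vocabulary.

```python
# Toy finite-state machine: state -> {allowed token: next state}.
# The tool names and call grammar are illustrative assumptions.
FSM = {
    "start": {"add": "name", "mul": "name"},
    "name": {"(": "arg1"},
    "arg1": {"1": "sep", "2": "sep", "3": "sep"},
    "sep": {",": "arg2"},
    "arg2": {"1": "close", "2": "close", "3": "close"},
    "close": {")": "done"},
}


def constrained_decode(score_fn, fsm=FSM, start="start", final="done"):
    """Greedy decoding that only considers tokens the FSM allows."""
    state, output = start, []
    while state != final:
        allowed = fsm[state]
        scores = score_fn(output)
        # Hard constraint: tokens outside `allowed` are never candidates,
        # no matter how highly the model scores them.
        token = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        output.append(token)
        state = allowed[token]
    return output


def toy_scores(prefix):
    """Stand-in for a language model; it always prefers an invalid token."""
    return {"hello": 9.0, "mul": 2.0, "add": 1.0, "(": 0.5,
            "2": 0.4, "3": 0.3, ",": 0.2, ")": 0.1}


call = constrained_decode(toy_scores)
print("".join(call))  # prints "mul(2,2)"
```

Although the toy model scores the invalid token `hello` highest at every step, the decoder can only follow transitions that exist in the FSM, so the output is always a syntactically valid call.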

This paper is available on arXiv under the CC 4.0 DEED license.