Trajectory Planning on ToolBench

Win rate

Results

Performance results of various models on this benchmark

Model Name	Win rate (%)	Paper Title
Attention Bucket	71.5	Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use
GPT4-TOPGUN	86.54	SwissNYF: Tool Grounded LLM Agents for Black Box Setting
GPT4-DFSDT	70.4	ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
