A$^2$FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning
View a PDF of the paper titled A$^2$FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning, by Qianben Chen, Jingyi Cao, Jiayu Zhang, Tianrui Qin, Xiaowan Li, King Zhu, Dingfeng Shi, He Zhu, Minghao Liu, Xiaobo Liang, Xin Gui, Ge Zhang, Jian Yang, Yuchen Eleanor Jiang, and Wangchunshu Zhou
Abstract: Large language models split into two families: reasoning-centric LLMs, which strengthen internal chain-of-thought reasoning but cannot invoke external tools, and agentic LLMs, which learn to interact with environments and leverage tools but often lag in deep reasoning. This divide arises from fundamentally different training objectives, leading to mismatched strengths and inefficiency on simple queries, where both families tend to overthink or over-call tools. In this work, we present the Adaptive Agent Foundation Model (A$^2$FM), a unified framework that follows a route-then-align principle: the model first learns task-aware routing and then aligns mode-specific trajectories under a shared backbone. To address the inefficiency gap, we introduce a third mode – instant – that handles simple queries directly, preventing unnecessary reasoning or tool calls while complementing the agentic and reasoning modes. To jointly enhance accuracy and efficiency, we propose Adaptive Policy Optimization (APO), which enforces adaptive sampling across modes and applies a cost-regularized reward. On the 32B scale, A$^2$FM achieves 13.4% on BrowseComp, 70.4% on AIME25, and 16.7% on HLE, setting a new SOTA among comparable models and performing competitively with frontier LLMs across agentic, reasoning, and general benchmarks. Notably, adaptive execution achieves a cost of pass of only $0.00487 per correct answer – cutting cost by 45.2% relative to the reasoning mode and by 33.5% relative to the agentic mode – thus delivering substantially higher cost efficiency while maintaining comparable accuracy.
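The abstract compresses two mechanisms worth unpacking: the route-then-align dispatch over three modes, and APO's cost-regularized reward. Below is a minimal sketch of how these could fit together; the paper does not publish this interface, so the class and function names, the learned-router call, and the linear cost penalty are illustrative assumptions rather than the authors' implementation.

```python
# Illustrative sketch only. Mode names come from the abstract; the routing
# call, reward shape, and all identifiers are assumptions for exposition.
from enum import Enum

class Mode(Enum):
    INSTANT = "instant"      # answer simple queries directly: no long trace, no tools
    REASONING = "reasoning"  # extended chain-of-thought, no external tool calls
    AGENTIC = "agentic"      # interact with the environment and invoke tools

def route(query: str, router) -> Mode:
    """Route-then-align, step 1: the shared backbone predicts a mode for the
    task, after which generation follows that mode's trajectory format.
    `router` is a hypothetical learned classifier head."""
    return router.predict_mode(query)

def apo_reward(correct: bool, cost_usd: float, lambda_cost: float = 1.0) -> float:
    """Cost-regularized reward in the spirit of APO: reward correctness and
    penalize execution cost (token and tool spend), discouraging overthinking
    and over-calling tools on easy queries. The linear form and the
    coefficient `lambda_cost` are assumptions, not the paper's exact objective."""
    return (1.0 if correct else 0.0) - lambda_cost * cost_usd
```

The "cost of pass" figure is conventionally the expected cost per correct answer, i.e. mean cost per attempt divided by pass rate. Reading "cutting cost by 45.2% / 33.5%" as multiplicative savings against the single-mode baselines, the implied baseline costs follow directly:

```python
# Back-of-the-envelope check using only the numbers quoted in the abstract.
adaptive = 0.00487                   # reported cost of pass, $ per correct answer
reasoning = adaptive / (1 - 0.452)   # implied reasoning-mode baseline, ~$0.00889
agentic = adaptive / (1 - 0.335)     # implied agentic-mode baseline, ~$0.00732
print(f"reasoning baseline ~ ${reasoning:.5f}, agentic baseline ~ ${agentic:.5f}")
```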
Submission history
From: Qianben Chen [view email]
[v1] Mon, 13 Oct 2025 17:08:25 UTC (1,242 KB)
[v2] Thu, 16 Oct 2025 07:41:48 UTC (1,242 KB)
[v3] Tue, 21 Oct 2025 03:44:09 UTC (1,242 KB)



