[CS.AI] SANA Framework: Unveiling Key Issues for QA Agents in Data Lakes

Abstract

Exploratory question answering (EQA) over data lakes requires an LLM agent to discover relevant sources, analyze retrieved data, and adapt its actions based on intermediate results. End-to-end accuracy alone cannot tell apart failures in search, planning, data analysis, and the agent's Action Policy: its decisions about what to do next and when to submit an answer.

We present SANA (Search Agent Navigation Ablation framework), a diagnostic ablation framework that transforms EQA tasks into runtime profiles, each containing a gold source sequence, sanitized subquestions, and execution records. SANA uses these profiles to construct idealized search, planning, and data-analysis tools, so that each component can be ablated in turn; the residual gap is then diagnostic evidence of policy failures.
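To make the ablation idea concrete, here is a minimal Python sketch of one way to read the paragraph above. It is not SANA's actual code or API: `RuntimeProfile`, the `oracle_*` helpers, `diagnose`, and the field names are all hypothetical, and the attribution rule (gain over baseline per idealized component, with the gap left after idealizing all three assigned to the policy) is an assumed interpretation of the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List

# The three components the abstract says can be idealized.
COMPONENTS = ("search", "planning", "analysis")


@dataclass(frozen=True)
class RuntimeProfile:
    """Hypothetical per-task profile; fields mirror the items named in the abstract."""
    task_id: str
    gold_sources: List[str]            # gold source sequence, in order
    subquestions: List[str]            # sanitized subquestions, in order
    execution_records: Dict[str, str]  # subquestion -> recorded result


def oracle_search(profile: RuntimeProfile, step: int) -> str:
    """Idealized search: return the gold source for a step instead of searching the lake."""
    return profile.gold_sources[step]


def oracle_plan(profile: RuntimeProfile, step: int) -> str:
    """Idealized planning: return the recorded subquestion for a step."""
    return profile.subquestions[step]


def oracle_analysis(profile: RuntimeProfile, subquestion: str) -> str:
    """Idealized data analysis: replay the recorded result for a subquestion."""
    return profile.execution_records[subquestion]


def diagnose(baseline: float,
             with_oracle: Dict[str, float],
             all_oracles: float) -> Dict[str, float]:
    """Assumed attribution: a component's gain is the accuracy with its oracle minus
    the baseline; the gap to perfect accuracy left after idealizing all three
    components is attributed to the action policy."""
    report = {f"{name}_gain": with_oracle[name] - baseline for name in COMPONENTS}
    report["residual_policy_gap"] = 1.0 - all_oracles
    return report
```

Under this reading, a large gain for one component marks it as a bottleneck, while a large `residual_policy_gap` points at the agent's decisions rather than its tools. Any accuracies passed to `diagnose` would come from actual agent runs; none are assumed here.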

To illustrate SANA as a reusable evaluation framework, we adapted two recent EQA benchmarks, LakeQA and KramaBench, and evaluated lightweight and mid-sized agents under fixed prompts, budgets, data lakes, and runtimes. Across both benchmarks, data analysis is a consistent bottleneck while planning is less so. Search is a major limitation in LakeQA's large data-lake setting, but less so for the smaller-scale KramaBench. Thus, SANA deconstructs end-to-end task accuracies into a diagnosis of where data-lake agents fail, allowing for systematic comparisons of progress in search, planning, data analysis, and agent design.

Blogger's Review: By isolating each component of a data-lake QA agent, the SANA framework shows where existing EQA agents fall short and gives future research a clear direction. This kind of breakdown makes question-answering performance easier to understand and optimize.
