[CS.AI] ProvenanceGuard: Source-Aware Factuality Verification for MCP-Based LLM Agents

Published at: 2026-06-17 22:00
Last updated: 2026-06-20 13:47
#algorithm #AI #Machine Learning

Abstract

As tool-using LLM agents increasingly leverage the Model Context Protocol (MCP) to answer from heterogeneous evidence sources, including search, APIs, databases, clinical records, and formulary tools, standard factuality metrics typically assess whether an answer is supported by pooled evidence, missing a provenance-sensitive failure mode: a claim may be supported somewhere while being attributed to the wrong source. We call this cross-source conflation.

We introduce ProvenanceGuard, a source-aware verifier for MCP-grounded answers. It consumes captured MCP traces with stable tool IDs, source IDs, and raw outputs; decomposes answers into atomic claims; routes claims to source-specific evidence; checks support with NLI and a token-alignment proxy; compares stated attribution with the routed source; and returns per-claim verdicts plus an answer-level allow/block decision. Blocked answers can be repaired with retrieval-augmented answer revision and re-verified.
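The pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation: the class and function names (`SourceRecord`, `Claim`, `Verdict`, `verify`, `decide`), the 0.6 threshold, and the example data are all hypothetical. Claim decomposition is assumed to have already happened, and a simple token-overlap score stands in for the NLI and token-alignment checks mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SourceRecord:
    """One captured MCP tool output (hypothetical trace schema)."""
    tool_id: str
    source_id: str
    raw_output: str


@dataclass(frozen=True)
class Claim:
    """An atomic claim plus the source the answer attributes it to."""
    text: str
    stated_source_id: Optional[str]


@dataclass(frozen=True)
class Verdict:
    claim: Claim
    routed_source_id: Optional[str]
    supported: bool
    attribution_ok: bool


def _tokens(text: str) -> set:
    return {t.strip(".,;:()").lower() for t in text.split()} - {""}


def overlap(claim_text: str, evidence_text: str) -> float:
    """Fraction of claim tokens found in the evidence (stand-in for NLI)."""
    claim_tokens = _tokens(claim_text)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & _tokens(evidence_text)) / len(claim_tokens)


def verify(claims: List[Claim], trace: List[SourceRecord],
           threshold: float = 0.6) -> List[Verdict]:
    """Route each claim to its best-matching source, then check attribution."""
    verdicts = []
    for claim in claims:
        best = max(trace, key=lambda s: overlap(claim.text, s.raw_output),
                   default=None)
        score = overlap(claim.text, best.raw_output) if best else 0.0
        supported = score >= threshold
        routed = best.source_id if supported else None
        # Attribution passes only if the stated source matches the routed one.
        attribution_ok = supported and claim.stated_source_id in (None, routed)
        verdicts.append(Verdict(claim, routed, supported, attribution_ok))
    return verdicts


def decide(verdicts: List[Verdict]) -> str:
    """Answer-level decision: block if any claim fails either check."""
    ok = all(v.supported and v.attribution_ok for v in verdicts)
    return "allow" if ok else "block"


# Made-up example: the dose claim is supported by the formulary output
# but the answer attributes it to the patient record.
trace = [
    SourceRecord("formulary_tool", "src-formulary",
                 "Drug A maximum daily dose is 40 mg"),
    SourceRecord("records_tool", "src-record",
                 "Patient is currently taking Drug A 20 mg daily"),
]
claims = [
    Claim("Drug A maximum daily dose is 40 mg", "src-record"),
    Claim("Patient is currently taking Drug A 20 mg daily", "src-record"),
]
verdicts = verify(claims, trace)
decision = decide(verdicts)
```

In this toy case the first claim is supported by pooled evidence yet fails the attribution check, so the answer is blocked. A source-blind verifier that only asks "is this supported anywhere?" would let it through, which is the cross-source conflation failure the abstract describes.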

We evaluate on 281 medical-domain MCP-agent traces. A 266-trace adjudicated subset yields 2,325 LLM-assisted claim labels split by trace; 361 held-out labels are human-verified. On the 40-trace held-out split, ProvenanceGuard achieves block F1 0.802 and source accuracy 0.858 over 260 source-eligible claims, outperforming source-blind baselines that do not emit claim-to-source IDs. On a harder multi-source benchmark, it reaches block F1 0.846, while source-plus-relation accuracy drops to 0.229, showing that exact source ownership remains difficult with semantically close sources. Repair-and-reverify resolves all blocked answers in the full trace set, often via conservative fallback. In 50 controlled clinical conflation probes, ProvenanceGuard detects all injected attribution swaps with no retained wrong attribution. These results demonstrate that source attribution is an independent axis for factuality verification in MCP-based agents.
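The abstract does not define its metrics, so the following is only a plausible reading: block F1 as binary F1 over answer-level decisions with "block" as the positive class, and source accuracy as the share of source-eligible claims whose predicted source ID matches the gold one. The function names and the toy labels below are assumptions for illustration, not data from the paper.

```python
from typing import List, Optional


def block_f1(gold_block: List[bool], pred_block: List[bool]) -> float:
    """Binary F1 with 'block' (True) as the positive class."""
    tp = sum(1 for g, p in zip(gold_block, pred_block) if g and p)
    fp = sum(1 for g, p in zip(gold_block, pred_block) if not g and p)
    fn = sum(1 for g, p in zip(gold_block, pred_block) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def source_accuracy(gold_sources: List[Optional[str]],
                    pred_sources: List[Optional[str]]) -> float:
    """Accuracy over source-eligible claims (gold source is not None)."""
    eligible = [(g, p) for g, p in zip(gold_sources, pred_sources)
                if g is not None]
    if not eligible:
        return 0.0
    return sum(1 for g, p in eligible if g == p) / len(eligible)


# Toy labels, invented for this example.
f1 = block_f1([True, True, False, False, True],
              [True, False, False, True, True])
acc = source_accuracy(["s1", "s2", None, "s1"],
                      ["s1", "s1", "s2", "s1"])
```

Keeping the two metrics separate reflects the abstract's point that a verifier can make good block decisions while still struggling to name the exact owning source.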

Blogger's Review: ProvenanceGuard targets a gap in factuality verification for MCP-based LLM agents: checking not only whether a claim is supported, but whether it is attributed to the correct source. By carrying tool and source IDs through the trace and validating each claim against its routed source, it catches attribution errors that pooled-evidence metrics miss, which matters most in high-stakes settings such as healthcare. The low source-plus-relation accuracy on the harder multi-source benchmark (0.229) shows that exact source ownership is far from solved, but overall this is a useful contribution to the reliability of LLM agents and merits further research.

Original Source: https://arxiv.org/abs/2606.18037
