[CS.AI] Software Delegation Contracts: Measuring Reviewability in AI Coding-Agent Work

AI coding agents are increasingly given software assignments: they modify repositories under bounded authority and return work packages for review. Prior work proposed the software delegation contract (the task, the authority, the returned work package, and the acceptance context) as the unit of analysis for delegated coding work, but did not measure its effects. This paper reports a controlled pilot study of explicit delegation contracts for coding agents.

We constructed a dependency-free TypeScript API task environment with seeded defects and documentation gaps, authored ten tasks across five families, and executed 64 agent runs under three conditions: a realistic issue-style prompt, an explicit delegation contract, and a contract with a required evidence bundle. Each run was scored with hidden acceptance tests, mutation checks, and scope analysis, then reviewed by three independent condition-blinded reviewers using a fixed rubric, resulting in 192 reviews.
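To make the four parts of a delegation contract and the scope analysis concrete, here is a minimal TypeScript sketch. The type `DelegationContract`, its field names, and the helper `scopeViolations` are hypothetical and not taken from the paper. The sketch also assumes authority is expressed as writable path prefixes, which the paper does not state.

```typescript
// Hypothetical shape of a delegation contract; field names are illustrative
// assumptions, not the paper's schema.
interface DelegationContract {
  task: string; // what the agent is asked to do
  authority: { writablePathPrefixes: string[] }; // where the agent may make changes
  workPackage: { requiredEvidence: string[] }; // what must be returned for review
  acceptanceContext: string; // how the reviewer will judge the result
}

// Returns the changed paths that fall outside the granted authority.
// Assumes authority is a list of path prefixes (an assumption of this sketch).
function scopeViolations(
  contract: DelegationContract,
  changedPaths: string[],
): string[] {
  return changedPaths.filter(
    (path) =>
      !contract.authority.writablePathPrefixes.some((prefix) =>
        path.startsWith(prefix),
      ),
  );
}

// Illustrative contract for an invented task.
const exampleContract: DelegationContract = {
  task: "Fix pagination off-by-one in the list endpoint",
  authority: { writablePathPrefixes: ["src/api/", "docs/"] },
  workPackage: { requiredEvidence: ["diff summary", "test output"] },
  acceptanceContext: "Hidden acceptance tests plus a rubric-based human review",
};

const violations = scopeViolations(exampleContract, [
  "src/api/users.ts",
  "package.json",
]);
console.log(violations);
```

In this sketch an empty result corresponds to a run with zero scope violations. The evidence-bundle condition would add a check that every item in `requiredEvidence` is present in the returned work package.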

The results indicate that explicit contracts did not improve objective task outcomes: all 64 runs passed hidden acceptance checks with zero scope violations. However, they did enhance reviewability. Evidence sufficiency improved in 22 of 30 paired comparisons and worsened in none (+0.83 on a 5-point scale, p < 0.01).
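The abstract does not say which statistical test produced the reported p-value. As a sanity check, an exact two-sided sign test can be run on the paired counts, assuming the 8 comparisons that neither improved nor worsened are ties and are dropped. The function names below are hypothetical, and this is a sketch of one common test, not necessarily the authors' analysis.

```typescript
// Binomial coefficient C(n, k), computed iteratively; each step stays an integer.
function binomialCoefficient(n: number, k: number): number {
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i;
  }
  return result;
}

// Exact two-sided sign test. Ties are excluded by the caller; under the null
// hypothesis each remaining pair is equally likely to improve or worsen.
function signTestPValue(improved: number, worsened: number): number {
  const n = improved + worsened;
  if (n === 0) return 1;
  const extreme = Math.min(improved, worsened);
  let tail = 0;
  for (let k = 0; k <= extreme; k++) {
    tail += binomialCoefficient(n, k);
  }
  // Double the one-sided tail probability and cap at 1.
  return Math.min(1, (2 * tail) / Math.pow(2, n));
}

// 22 improved, 0 worsened (8 assumed ties dropped).
const pValue = signTestPValue(22, 0);
console.log(pValue.toExponential(2)); // prints "4.77e-7"
```

Under these assumptions the sign test gives a p-value far below 0.01, which is consistent with the reported result. It does not use the size of the differences, so it says nothing about the +0.83 effect size.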

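Blogger's Review: In this pilot, explicit delegation contracts did not change objective task outcomes, since every run already passed the hidden acceptance checks, but they made the returned work easier to review: blinded reviewers rated evidence sufficiency higher under contracts and never lower. The result suggests that contracts matter less for whether an agent completes a task than for whether a reviewer can verify that it did.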

[CS.AI] Software Delegation Contracts: Measuring Reviewability in AI Coding-Agent Work