[CS.AI] Scenario-Specific Safety Envelopes for VLA Driving

Safety certification of Vision-Language-Action (VLA) driving planners under ISO 21448 (SOTIF) relies on an Operational Design Domain (ODD) specification that answers two complementary questions: when does the planner start to fail, and how severely does it fail? We evaluate Alpamayo R1, a 10B-parameter open-weight driving VLA, on 15,968 (clip, attack) pairs. Our findings reveal a conservative-aggregate gap: an aggregate safe threshold of $\sigma \leq 50$ under a 15% average displacement error (ADE) budget conceals well-sampled scenarios that tolerate the top of the tested grid ($\sigma = 70$). A Gaussian Mixture Model (GMM) fitted to the changed-explanation subset identifies six discrete severity bands (BIC-optimal $k{=}6$), so two perturbation conditions with the same mean error can differ substantially in their share of high-severity (C4/C5) failures. Combining both analyses on the same corpus uncovers a result that neither yields on its own: the scenarios with the loosest noise thresholds are not the ones with the lowest high-severity rates. For instance, STOP_SIGNAL has roughly $4\times$ the C4/C5 share of LANE_KEEPING despite tolerating a larger $\sigma$. Hence, a deployable SOTIF ODD specification for driving VLAs requires a two-dimensional safety envelope rather than a single aggregate value per hazard.
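To make the idea of a two-dimensional envelope concrete, here is a minimal Python sketch that computes, for each scenario, a safe noise threshold and a C4/C5 share. It is an illustration under stated assumptions, not the paper's implementation: the record layout, the function names, and the reading of the "15% ADE budget" as a relative ADE increase over the clean baseline are all hypothetical, and the severity band is assumed to be precomputed (for example, from the GMM) and set to `None` when the explanation did not change.

```python
from collections import defaultdict

HIGH_SEVERITY = ("C4", "C5")  # band labels taken from the abstract


def safe_sigma(mean_rel_ade_by_sigma, budget=0.15):
    """Largest tested sigma such that it and every smaller tested sigma stay within budget.

    Assumption: the budget is a relative ADE increase over the clean baseline.
    Returns None if even the smallest tested sigma exceeds the budget.
    """
    safe = None
    for sigma in sorted(mean_rel_ade_by_sigma):
        if mean_rel_ade_by_sigma[sigma] > budget:
            break
        safe = sigma
    return safe


def high_severity_share(bands):
    """Fraction of banded failures that fall in C4 or C5 (0.0 if there are none)."""
    if not bands:
        return 0.0
    return sum(b in HIGH_SEVERITY for b in bands) / len(bands)


def safety_envelope(records, budget=0.15):
    """Map each scenario to (safe_sigma, c4_c5_share).

    records: iterable of (scenario, sigma, rel_ade_increase, severity_band),
    a hypothetical layout; severity_band may be None.
    """
    ade = defaultdict(lambda: defaultdict(list))
    bands = defaultdict(list)
    for scenario, sigma, rel_ade, band in records:
        ade[scenario][sigma].append(rel_ade)
        if band is not None:
            bands[scenario].append(band)

    envelope = {}
    for scenario, by_sigma in ade.items():
        mean_by_sigma = {s: sum(v) / len(v) for s, v in by_sigma.items()}
        envelope[scenario] = (
            safe_sigma(mean_by_sigma, budget),
            high_severity_share(bands[scenario]),
        )
    return envelope


# Invented toy records for illustration only; these are NOT results from the paper.
toy_records = [
    ("STOP_SIGNAL", 50, 0.05, "C4"),
    ("STOP_SIGNAL", 70, 0.10, "C1"),
    ("LANE_KEEPING", 50, 0.08, "C1"),
    ("LANE_KEEPING", 70, 0.30, "C2"),
]
toy_envelope = safety_envelope(toy_records)
```

In this toy data the scenario with the looser threshold also has the higher C4/C5 share, which is the pattern the abstract describes: ranking scenarios by threshold alone would hide the severity dimension, so the sketch keeps both numbers side by side.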

Blogger's Review: This article uses empirical analysis to show how complex safety performance in VLA driving systems can be. The impact of scenario type and perturbation condition on safety should not be underestimated; future safety standards should assess both when failures begin and how severe they are, rather than relying on a single aggregate threshold, to ensure the reliability and safety of autonomous driving systems.