Large language model (LLM) systems increasingly use uncertainty signals to allocate limited computation across verification, test-time scaling, tool execution, and other selective compute decisions. Such policies rely on a global signal comparability assumption: equal scores should carry comparable decision value across inputs. Using budgeted verification as a controlled diagnostic setting, we identify a failure mode of this assumption: uncertainty quality is heteroskedastic across cost strata, with some regions exhibiting near-random discriminability despite containing a large share of the errors.
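One way to check for this kind of heterogeneity is to measure how well the uncertainty score separates errors from correct outputs within each cost stratum. The following is a minimal sketch, not the paper's diagnostic: it assumes discriminability is measured by AUROC, and the function names, stratum labels, and toy numbers are all hypothetical.

```python
from collections import defaultdict


def auroc(scores, is_error):
    """Probability that a random error gets a higher score than a random
    correct item (ties count as 0.5). Returns None if a class is missing."""
    pos = [s for s, y in zip(scores, is_error) if y == 1]
    neg = [s for s, y in zip(scores, is_error) if y == 0]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_stratum(scores, is_error, strata):
    """Compute AUROC separately inside each stratum."""
    groups = defaultdict(lambda: ([], []))
    for s, y, g in zip(scores, is_error, strata):
        groups[g][0].append(s)
        groups[g][1].append(y)
    return {g: auroc(sc, lb) for g, (sc, lb) in groups.items()}


# Toy data (hypothetical): the score ranks errors perfectly in one stratum
# and carries no information in the other.
scores = [0.9, 0.8, 0.2, 0.1, 0.9, 0.1, 0.9, 0.1]
is_error = [1, 1, 0, 0, 1, 1, 0, 0]
strata = ["low_cost"] * 4 + ["high_cost"] * 4

per_stratum = auroc_by_stratum(scores, is_error, strata)
print(per_stratum)  # {'low_cost': 1.0, 'high_cost': 0.5}
```

A value near 0.5 in a stratum means the score is close to uninformative there, even if the pooled AUROC over all inputs looks acceptable.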
Under an explicit local model, we characterize the resulting distortion of global allocation and show that its upper bound scales with cross-stratum signal-quality dispersion. We separate weak signals, optimization instability, and structural heterogeneity through a controlled intervention hierarchy: Threshold, MP-Adapt, MP-Strat, and a deliberately simple cost-stratified thresholding intervention (CST). Across MBPP and MATH using Qwen3-8B, LLaMA3-8B, and GPT-4o-mini, global online adaptation yields inconsistent gains over static thresholding; MP-Strat partially recovers performance, while CST improves hit rate by up to 17 percentage points in strongly heterogeneous settings without gradient updates. These results identify structural heterogeneity, rather than optimizer weakness alone, as the primary bottleneck in the observed settings. More broadly, misaligned feedback structure cannot always be repaired by stronger optimization.
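The contrast between a single global threshold and a per-stratum threshold can be illustrated with a small sketch. It is not the paper's CST implementation: it assumes each threshold is set so that roughly the same fraction of items is verified, and that hit rate is the share of verified items that turn out to be errors. All names and numbers are hypothetical.

```python
from collections import defaultdict


def top_fraction_threshold(scores, rate):
    """Score of the k-th highest item, where k is about rate * n (at least 1)."""
    ranked = sorted(scores, reverse=True)
    k = max(1, int(rate * len(ranked)))
    return ranked[k - 1]


def fit_stratified(scores, strata, rate):
    """One threshold per stratum, each verifying about the same fraction."""
    by_stratum = defaultdict(list)
    for s, g in zip(scores, strata):
        by_stratum[g].append(s)
    return {g: top_fraction_threshold(v, rate) for g, v in by_stratum.items()}


def hit_rate(scores, strata, is_error, threshold):
    """Fraction of verified items (score >= threshold) that are errors.
    `threshold` is a float (global) or a dict keyed by stratum."""
    picked = []
    for s, g, e in zip(scores, strata, is_error):
        t = threshold[g] if isinstance(threshold, dict) else threshold
        if s >= t:
            picked.append(e)
    return sum(picked) / len(picked) if picked else 0.0


# Toy data (hypothetical): scores rank errors well inside each stratum,
# but stratum "B" sits on a lower scale, so equal scores are not comparable.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
strata = ["A"] * 4 + ["B"] * 4
is_error = [1, 1, 0, 0, 1, 1, 0, 0]

global_t = top_fraction_threshold(scores, rate=0.5)
strat_t = fit_stratified(scores, strata, rate=0.5)

global_hits = hit_rate(scores, strata, is_error, global_t)
strat_hits = hit_rate(scores, strata, is_error, strat_t)
print(global_hits, strat_hits)  # 0.5 1.0
```

In this toy case the global threshold spends the whole budget on stratum A, while the stratified thresholds verify the top half of each stratum. No gradient updates are involved in either policy; the only difference is whether the threshold is allowed to vary by stratum.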
Blogger's Review: This paper shows that uncertainty signals in budgeted LLM verification are heteroskedastic across cost strata, and that this structural heterogeneity, rather than optimizer weakness alone, limits how well a single global policy can allocate compute. By comparing Threshold, MP-Adapt, MP-Strat, and CST, the authors find that a simple stratified threshold can outperform global online adaptation, improving hit rate by up to 17 percentage points in strongly heterogeneous settings. The practical takeaway is that an allocation strategy depends not only on the optimization algorithm but also on the quality and structure of the signal it consumes.