[CS.AI] A Two-Stage Statistical Framework for Evaluating ...

As large language models (LLMs) are increasingly evaluated for bias using adaptations of human psychological paradigms, methodological limitations, particularly the conflation of refusal behavior with task performance, have hindered clear interpretation. Here, we adapt the Implicit Association Test (IAT) to a controlled, forced-choice framework and introduce a two-stage modeling approach that separates response compliance from task-consistent classification.

We evaluate associative interference, defined as reduced task consistency in incongruent relative to congruent conditions, across three contemporary LLMs (Claude Sonnet-4, Gemini 2.5 Pro, and GPT-5). While compliance with the structured response format was uniformly high, interference effects varied substantially across models and domains.

Claude Sonnet-4 exhibited strong interference in the Gender-Career domain (ΔP = 0.086, 95% CrI [0.026, 0.173]) and smaller but credible effects in Gender-Science. Gemini 2.5 Pro showed attenuated interference, and GPT-5 exhibited minimal or no detectable interference across domains. These findings demonstrate that IAT-style associative asymmetries are not a universal property of LLMs but depend on model-specific characteristics.
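
To make the reported quantity concrete, here is a minimal sketch of how a difference in task-consistency probabilities with a 95% credible interval could be computed once compliance has been separated out. It is an illustration, not the authors' model: the abstract mentions item-level variability, which suggests a hierarchical model, whereas this sketch assumes a simple pooled Beta-Binomial posterior with a uniform Beta(1, 1) prior. The function names (`summarize`, `delta_p`, `make_trials`) and all trial counts are hypothetical and are not taken from the study.

```python
import random

def summarize(trials):
    """Stage 1 counts compliant responses; Stage 2 counts task-consistent
    responses among the compliant ones, separately for each condition."""
    out = {}
    for cond in ("congruent", "incongruent"):
        rows = [t for t in trials if t["condition"] == cond]
        complied = [t for t in rows if t["complied"]]
        out[cond] = {
            "n": len(rows),
            "n_complied": len(complied),
            "n_consistent": sum(1 for t in complied if t["consistent"]),
        }
    return out

def delta_p(summary, draws=20000, seed=0):
    """Monte Carlo posterior for P(consistent | congruent) minus
    P(consistent | incongruent), conditioning on compliance.
    Assumption: independent Beta(1, 1) priors, no item-level effects."""
    rng = random.Random(seed)

    def draw(cond):
        k = summary[cond]["n_consistent"]
        n = summary[cond]["n_complied"]
        return rng.betavariate(1 + k, 1 + n - k)

    samples = sorted(draw("congruent") - draw("incongruent") for _ in range(draws))
    lower = samples[int(0.025 * draws)]
    upper = samples[int(0.975 * draws) - 1]
    return sum(samples) / draws, (lower, upper)

def make_trials(cond, n, n_complied, n_consistent):
    """Build hypothetical trial records (illustrative data only)."""
    trials = []
    for i in range(n):
        complied = i < n_complied
        trials.append({
            "condition": cond,
            "complied": complied,
            "consistent": complied and i < n_consistent,
        })
    return trials

# Hypothetical counts, chosen only to exercise the code.
trials = (make_trials("congruent", 200, 198, 190)
          + make_trials("incongruent", 200, 197, 170))
summary = summarize(trials)
mean_dp, (lo, hi) = delta_p(summary)
print(summary)
print(f"DeltaP ~ {mean_dp:.3f}, 95% CrI [{lo:.3f}, {hi:.3f}]")
```

Conditioning the second stage on compliant trials is what keeps refusals from being counted as task errors. A positive difference whose interval excludes zero would correspond to what the abstract calls a credible interference effect.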

By isolating interference from compliance and modeling item-level variability, this study provides a principled framework for evaluating structured response patterns in LLMs. The results highlight the importance of model-specific assessment and suggest that associative interference can be substantially mitigated in modern systems.

Blogger's Review: This study uses a two-stage model to separate format compliance from associative interference and finds that interference differs markedly across models, from credible effects in Claude Sonnet-4 to little or none in GPT-5. For future research, the takeaway is that bias should be assessed model by model rather than assumed to be a general property of LLMs.

[CS.AI] A Two-Stage Statistical Framework for Evaluating Associative Interference in LLMs