In industrial assembly line disruption recovery, scheduling decisions must be made quickly under machine faults, worker absence, and emergency orders. Existing methods either depend on rigid handcrafted recovery logic or use adaptive policies that cannot leverage heterogeneous external recovery knowledge at decision time, which hurts both abnormal recovery time (ART) and on-time delivery (OTD). To bridge this gap, we propose a phase-aware guidance injection framework that enhances a trained recurrent MAPPO (RMAPPO) scheduling policy by adding a logit-level action bias during evaluation. The framework offers a unified decision-time interface for rule-based, replay-based, and online LLM-based guidance, and activates interventions only during the abnormal and recovery phases. Experiments on a custom AssemblyLineEnv demonstrate that high-quality rule guidance yields the strongest gains, replay-based guidance degrades smoothly under imperfect availability, and online LLM guidance still provides useful intermediate improvements. These results indicate that decision-time guidance injection can exploit heterogeneous recovery hints without redesigning the actor.
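To make the idea of a phase-gated, logit-level action bias concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact form of the bias, so the additive rule `logits + beta * hint`, the phase labels, and the names `inject_guidance`, `hint`, and `beta` are all hypothetical.

```python
import math
from typing import List, Optional, Sequence

# Hypothetical phase labels. The abstract only states that guidance is
# active during the abnormal and recovery phases.
GUIDED_PHASES = {"abnormal", "recovery"}


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inject_guidance(
    logits: Sequence[float],
    phase: str,
    hint: Optional[Sequence[float]] = None,
    beta: float = 1.0,
) -> List[float]:
    """Return action probabilities, biased by a guidance hint when active.

    `hint` is an assumed per-action score vector that could come from a
    rule, a replay lookup, or an LLM. `beta` is an assumed bias strength.
    The actor's own logits are left untouched outside guided phases or
    when no hint is available.
    """
    if phase not in GUIDED_PHASES or hint is None:
        return softmax(logits)
    if len(hint) != len(logits):
        raise ValueError("hint must have one score per action")
    biased = [l + beta * h for l, h in zip(logits, hint)]
    return softmax(biased)


# Usage: the same hint is ignored in the normal phase and applied in the
# abnormal phase, where it shifts probability toward action 1.
actor_logits = [1.0, 1.0, 1.0]
rule_hint = [0.0, 2.0, 0.0]
print(inject_guidance(actor_logits, "normal", rule_hint))
print(inject_guidance(actor_logits, "abnormal", rule_hint))
```

Because the bias is applied only to the logits at evaluation time, the trained actor needs no architectural change, and any guidance source that can emit a per-action score vector fits the same interface. A missing hint (for example, a replay miss) simply falls back to the unmodified policy.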
Blogger's Review: This research introduces a novel approach to dynamic decision-making in industrial production. It places several guidance sources (rules, replay, and an online LLM) behind a single phase-aware interface and applies them only when the line is in an abnormal or recovery phase, which underlines the value of real-time adjustment in complex environments. Future studies could explore how to tune these guidance sources for broader applications.