[CS.AI] Assessing LLM Safety in Automotive Contexts

This paper appraises AI development frameworks for integrating large language models (LLMs) into control tasks in automotive contexts, with a focus on safety assurance. As LLMs are rapidly integrated into automotive settings, the work reveals significant challenges that limit their efficacy in real-time, safety-critical environments.

First, we consider conceptual challenges, in particular the dual challenge deployers face when a model developed upstream by large AI labs must be assured in a downstream context, specifically within vehicle architectures. Second, we analyze concrete challenges drawn from existing standards.

For instance, we highlight fundamental engineering constraints covered in ISO 21448, such as latency, alongside novel LLM-specific issues such as the alignment-related concerns in ISO/PAS 8800. We ground the discussion in an introductory experimental case study of an existing open-source repository, Talk2Drive, and present a safety argument that makes the limitations of current solutions explicit. Nonetheless, because the use of LLMs in automotive contexts is already being technically explored and operationalized, we propose potential assurance mechanisms for LLM-related hazardous events going forward.
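To make the latency constraint concrete, here is a minimal Python sketch of a latency budget wrapped around an LLM call. It is not taken from the paper or from the Talk2Drive repository: the names `command_within_budget`, `FALLBACK_COMMAND`, `fast_model`, and `slow_model` are hypothetical, and the budget values are illustrative rather than drawn from ISO 21448. The idea is only that a reply arriving after the deadline is discarded in favour of a predefined fallback.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable

# Hypothetical fallback: keep whatever behaviour the vehicle has already validated.
FALLBACK_COMMAND = "maintain_current_settings"

# Shared worker pool so a late reply never blocks the caller.
_executor = ThreadPoolExecutor(max_workers=2)


def command_within_budget(
    llm_call: Callable[[str], str], request: str, budget_s: float
) -> str:
    """Return the model's command, or the fallback if the budget is exceeded."""
    future = _executor.submit(llm_call, request)
    try:
        return future.result(timeout=budget_s)
    except FutureTimeout:
        # Best effort only: a call that is already running is not interrupted,
        # its result is simply ignored.
        future.cancel()
        return FALLBACK_COMMAND
    except Exception:
        # Any failure inside the model call is treated like a missed deadline.
        return FALLBACK_COMMAND


# Stand-ins for a real model client (assumptions for illustration only).
def fast_model(request: str) -> str:
    return "reduce_speed"


def slow_model(request: str) -> str:
    time.sleep(0.2)  # simulates a reply that arrives too late
    return "reduce_speed"
```

Calling `command_within_budget(fast_model, "slow down", 0.5)` returns the model's command, while calling it with `slow_model` and a 0.05 second budget returns the fallback. A guard like this only bounds how long the caller waits; it says nothing about whether an on-time reply is itself safe, which is where the alignment-related concerns come in.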

Blogger's Review: This paper provides an in-depth examination of the complexities involved in applying LLMs to automotive safety, highlighting the limitations of existing frameworks. As technology advances, ensuring the safety of LLMs will be a critical area for future research.