VISION-RISK: Vision-Language Model for Risk Assessment in Explainable Autonomous Driving Systems

EasyChair Preprint 16014 • 9 pages • Date: March 23, 2026

Abstract

The advancement of autonomous driving systems hinges on their ability to navigate complex environments while ensuring safety and transparency. The lack of explainability in current technologies, that is, the ability to provide clear, human-readable justifications for actions, undermines trust, complicates validation, and hinders widespread adoption. In this paper, we introduce VISION-RISK, a vision-language model (VLM) designed for risk assessment and explainability in autonomous driving, using a lightweight architecture optimized for deployment on edge devices. To train the model, we developed a custom dataset combining real-world driving scenarios from the Honda Driving Dataset (HDD) and extreme high-risk cases from Crash1500, augmented with synthetic annotations using Dolphins and refined via DeepSeek V3. VISION-RISK stands out through three key characteristics: the integration of danger-level classification with natural language explanation generation, a lightweight architecture optimized for deployment on resource-constrained devices, and a strong emphasis on interpretability and safety to enhance trust in autonomous systems.

Keyphrases: Explainability, VLM, autonomous driving

