A Domain Knowledge-Enhanced Large Vision-Language Model for Construction Site Safety Monitoring

15 pages • Published: August 28, 2025

Chak-Fu Chan, Xiaowen Guo, Peter Kok-Yiu Wong, Jolly Pui-Ching Chan, Jack C.P. Cheng, Pak-Him Leung and Xingyu Tao

Abstract

To address industry-wide and policy-driven requirements for construction site safety monitoring, this paper develops a virtual assistant agent based on a large vision-language model (VLM), integrated into an on-site surveillance camera system to identify unsafe worker behaviors and raise alerts in real time. First, we designed a semi-automatic image-text labeling pipeline that employs in-context learning to improve data annotation efficiency. Then, we established a two-stage curriculum learning paradigm to embed construction domain knowledge deeply into the VLM, which is in turn integrated into a real-time video analytics engine for safety compliance inspection and interactive visual question answering. The system has been deployed on a real construction site, where it achieves around 90% accuracy in identifying violations of work-at-height safety regulations.

Keyphrases: construction site safety monitoring, data efficient fine tuning strategy, domain tailored large vision language model, multi modal safety compliance checking, virtual construction safety assistant

In: Jack Cheng and Yu Yantao (editors). Proceedings of The Sixth International Conference on Civil and Building Engineering Informatics, vol 22, pages 894-908.

Links:	https://easychair.org/publications/paper/7fbx
	https://doi.org/10.29007/1n8g

BibTeX entry

@inproceedings{ICCBEI2025:Domain_Knowledge_Enhanced_Large,
  author    = {Chak-Fu Chan and Xiaowen Guo and Peter Kok-Yiu Wong and Jolly Pui-Ching Chan and Jack C.P. Cheng and Pak-Him Leung and Xingyu Tao},
  title     = {A Domain Knowledge-Enhanced Large Vision-Language Model for Construction Site Safety Monitoring},
  booktitle = {Proceedings of The Sixth International Conference on Civil and Building Engineering Informatics},
  editor    = {Jack Cheng and Yu Yantao},
  series    = {Kalpa Publications in Computing},
  volume    = {22},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2515-1762},
  url       = {https://easychair.org/publications/paper/7fbx},
  doi       = {10.29007/1n8g},
  pages     = {894-908},
  year      = {2025}}
