Automated Text Classification of Construction Inspection Report: A Small Samples Training Approach

15 pages. Published: August 28, 2025

Abstract

Risk management is crucial for construction safety, but safety risk assessment often relies on expert knowledge, which makes automating risk management in engineering projects a major challenge. Large-scale infrastructure construction requires on-site inspection, and site conditions are recorded as text, which offers an opportunity to learn risk information from inspection reports. Automatic text classification plays an important role in improving document processing efficiency, but current approaches require large-scale training datasets. This is a serious obstacle for the engineering industry, especially in fields that rely heavily on expert knowledge, such as risk assessment: limited data sources and high time and labor costs make it impractical to build a large-scale dataset. This work proposes a BERT-based ensemble model for small-sample text classification that uses the Focal loss function to address data imbalance. An ensemble strategy is employed to enhance the model's generalization capability, and the learning rate gradient descent method is applied to mitigate the risk of overfitting. The proposed framework is validated on a four-class task that identifies risk levels from the inspection reports of a metro construction project. The proposed BERT-based ensemble model achieves an accuracy of 96.24% on the test set, surpassing other pre-trained classification models on this automated text classification task.
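To make the two main ingredients of the abstract concrete, the following is a minimal sketch of a multi-class focal loss and a simple ensemble vote. It is not the authors' implementation: the abstract does not state the values of the focusing parameter or class weights, nor how ensemble members are combined, so `gamma=2.0`, the optional `alpha` weights, and the probability-averaging ("soft voting") rule in `soft_vote` are assumptions chosen for illustration. The focal loss follows the standard form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), which reduces to cross-entropy when gamma is 0.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample.

    logits: raw scores, one per class (four risk levels in the paper's task).
    target: index of the true class.
    gamma:  focusing parameter (assumed value; gamma=0 gives cross-entropy).
    alpha:  optional per-class weights (assumed; None means no weighting).
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def soft_vote(member_probs):
    """Average class probabilities across ensemble members (assumed rule).

    member_probs: list of probability vectors, one per ensemble member.
    Returns (predicted class index, averaged probability vector).
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    avg = [sum(p[c] for p in member_probs) / n_members for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


if __name__ == "__main__":
    logits = [4.0, 0.0, 0.0, 0.0]
    # Well-classified sample (true class 0) vs. misclassified sample (true class 1).
    print("focal, easy sample:", focal_loss(logits, 0))
    print("focal, hard sample:", focal_loss(logits, 1))
    print("cross-entropy, easy sample:", focal_loss(logits, 0, gamma=0.0))
    members = [
        [0.7, 0.1, 0.1, 0.1],
        [0.2, 0.6, 0.1, 0.1],
        [0.6, 0.2, 0.1, 0.1],
    ]
    print("ensemble prediction:", soft_vote(members)[0])
```

The factor (1 - p_t)^gamma shrinks the loss of samples the model already classifies confidently, so rare or hard classes contribute relatively more to training. In a real fine-tuning setup the same formula would be applied to batched tensors from a BERT classification head rather than Python lists.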

Keyphrases: bert, construction risk management, multi label text classification, small sample training

In: Jack Cheng and Yu Yantao (editors). Proceedings of The Sixth International Conference on Civil and Building Engineering Informatics, vol 22, pages 543-557.

BibTeX entry
@inproceedings{ICCBEI2025:Automated_Text_Classification_Construction,
  author    = {Kai Li and Chao Dong and Xueqing Fang and Da Li},
  title     = {Automated Text Classification of Construction Inspection Report: A Small Samples Training Approach},
  booktitle = {Proceedings of The Sixth International Conference on Civil and Building Engineering Informatics},
  editor    = {Jack Cheng and Yu Yantao},
  series    = {Kalpa Publications in Computing},
  volume    = {22},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2515-1762},
  url       = {/publications/paper/JKK8},
  doi       = {10.29007/24bp},
  pages     = {543-557},
  year      = {2025}}