Data-driven automatic classification model for construction accident cases using natural language processing with hyperparameter tuning

Louis Kumi, Jaewook Jeong, Jaemin Jeong

Research output: Contribution to journal › Article › peer-review

19 Scopus citations

Abstract

The construction industry, while vital to societal progress, has a high incidence of accidents and injuries. Manual classification of accident cases is labor-intensive and susceptible to human bias. This study addresses the problem by developing an automated accident case classification system for the construction industry using Natural Language Processing (NLP) and machine learning techniques. The study proceeded in four steps: (1) establishment of the dataset, (2) Korean natural language processing, (3) selection of machine learning models, and (4) model evaluation. The models exhibited competitive performance in accuracy, precision, and recall across all classification tasks. XGBoost outperformed Naive Bayes (NB), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), reaching accuracies of 0.80, 0.56, and 0.67 for accident type, facility type, and work type, respectively. The results also provided insights into the factors influencing accident classification. This study contributes to construction safety by providing a data-driven foundation for safety decision-making, resource allocation, and benchmarking.
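The four steps named in the abstract (dataset, text processing, model selection with tuning, evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hand-written multinomial Naive Bayes in place of the library models used in the study, a whitespace tokenizer as a stand-in for Korean morphological analysis, a small grid over the smoothing parameter `alpha` scored on one validation split (the abstract does not describe the tuning procedure), and a few invented English accident descriptions with hypothetical labels.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Stand-in for Korean morphological analysis: lowercase whitespace split.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with additive (Laplace-style) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; must be > 0

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        tokens = [t for t in tokenize(doc) if t in self.vocab]  # skip unseen words
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.class_counts[label] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def evaluate(y_true, y_pred):
    # Accuracy plus macro-averaged precision and recall over the true labels.
    labels = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls = [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        actual = sum(t == label for t in y_true)
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
    }


def tune_alpha(train, valid, grid=(0.1, 0.5, 1.0)):
    # Grid search: keep the alpha with the best validation accuracy.
    best = None
    for alpha in grid:
        model = NaiveBayes(alpha).fit(*train)
        preds = [model.predict(d) for d in valid[0]]
        acc = evaluate(valid[1], preds)["accuracy"]
        if best is None or acc > best[1]:
            best = (alpha, acc, model)
    return best


# Invented toy data (hypothetical accident-type labels, not from the study).
train = (
    ["worker fell from scaffold", "worker slipped and fell from ladder",
     "fell through roof opening", "worker struck by falling pipe",
     "struck by reversing excavator", "hit by swinging crane load"],
    ["fall", "fall", "fall", "struck", "struck", "struck"],
)
valid = (["fell from ladder", "struck by excavator bucket"], ["fall", "struck"])

best_alpha, best_acc, model = tune_alpha(train, valid)
metrics = evaluate(valid[1], [model.predict(d) for d in valid[0]])
print(best_alpha, metrics)
```

In practice the same loop would be run once per target (accident type, facility type, work type) and once per candidate model, with the held-out metrics compared across models.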

Original language: English
Article number: 105458
Journal: Automation in Construction
Volume: 164
State: Published - Aug 2024

Keywords

  • Accident classification
  • Accident type
  • Facility type
  • Korean NLP
  • Machine learning
  • Work type
