Evaluating ChatGPT on Korea's BIM Expertise Exam and improving its performance through RAG

Youngsu Yu, Sihyun Kim, Wonbok Lee, Bonsang Koo

Research output: Contribution to journal › Article › peer-review

Abstract

This study evaluated how well ChatGPT-3.5 and ChatGPT-4 understand the building information modelling (BIM) knowledge domain by testing them on the multiple-choice section of Korea's BIM Expertise Exam (KBEE), a professional license exam tailored for BIM. ChatGPT-4 achieved passing scores across all tested years and averaged 85%, 20% higher than its predecessor. Nevertheless, ChatGPT-4 had difficulty in certain areas, including the 'BIM guidelines' subcategory, because the problems required access to documents specific to Korea's BIM policies and stipulations. When the required documents were supplied using Retrieval Augmented Generation (RAG), ChatGPT-4's score rose further to 88.6%, an improvement of 25.7%. The results provide evidence that ChatGPT-4, albeit within the context of the KBEE, is knowledgeable overall in the BIM domain. The RAG results demonstrate that partial gaps in its knowledge can be filled by providing the appropriate documents. The study verified that this capability was particularly valuable for categories that vary with local and regional contexts. The findings suggest that ChatGPT, when integrated with RAG, holds significant potential as a comprehensive and adaptive knowledge model in the BIM domain, and could thereby alleviate barriers to BIM adoption such as the lack of experts, the cost of training, the complexity of BIM models, and the additional workload required during the BIM delivery process.

Original language: English
Pages (from-to): 94-120
Number of pages: 27
Journal: Journal of Computational Design and Engineering
Volume: 12
Issue number: 4
State: Published - 1 Apr 2025

Keywords

  • building information modelling
  • Generative Pre-trained Transformer
  • knowledge assessment
  • large language model
  • professional exam
  • Retrieval Augmented Generation
