TY - JOUR
T1 - A Data Imputation Approach for Missing Power Consumption Measurements in Water-Cooled Centrifugal Chillers
AU - Kim, Sung Won
AU - Kim, Young Il
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/6
Y1 - 2025/6
AB - When operational data are collected for the performance analysis of water-cooled centrifugal chillers, missing values are inevitable owing to factors such as sensor errors, data transmission failures, and measurement system failures. When a substantial amount of data is missing, the reliability of the analysis decreases and the results can be distorted. To address this issue, it is necessary either to minimize the occurrence of missing values by using high-precision measurement equipment or to apply reliable imputation techniques to compensate for them. This study focuses on two water-cooled turbo chillers installed in Tower A, Seoul, for which a total of 118,464 data points were collected over 3 years and 4 months. The dataset includes the chilled water inlet and outlet temperatures ((Formula presented.) and (Formula presented.)) and flow rate ((Formula presented.)), the cooling water inlet and outlet temperatures ((Formula presented.) and (Formula presented.)) and flow rate ((Formula presented.)), and the chiller power consumption ((Formula presented.)). To evaluate the performance of various imputation techniques, we introduced missing values at rates of 10–30% under the assumption of a missing-at-random (MAR) mechanism. Seven imputation methods were applied: mean, median, linear interpolation, multiple imputation, simple random imputation, k-nearest neighbors (KNN), and dynamically clustered KNN (DC-KNN). Their imputation performance was validated using the MAPE and CVRMSE metrics. The DC-KNN method, developed in this study, improves upon conventional KNN imputation by integrating clustering and dynamic weighting mechanisms. The results indicate that DC-KNN achieved the highest predictive performance, with MAPE ranging from 9.74% to 10.30% and CVRMSE ranging from 12.19% to 13.43%. Finally, for the data missing in July 2023, we applied DC-KNN, the most effective method, to generate imputed values that reflect the characteristics of the studied site, which employs an ice thermal energy storage system.
KW - centrifugal chiller
KW - CVRMSE
KW - data imputation
KW - DC-KNN
KW - MAPE
KW - performance analysis
UR - https://www.scopus.com/pages/publications/105007760737
U2 - 10.3390/en18112779
DO - 10.3390/en18112779
M3 - Article
AN - SCOPUS:105007760737
SN - 1996-1073
VL - 18
JO - Energies
JF - Energies
IS - 11
M1 - 2779
ER -