TY - JOUR
T1 - Product failure prediction with missing data
AU - Kang, Seokho
AU - Kim, Eunji
AU - Shim, Jaewoong
AU - Chang, Wonsang
AU - Cho, Sungzoon
N1 - Publisher Copyright:
© 2017 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2018/7/18
Y1 - 2018/7/18
N2 - In production data, missing values commonly appear for several reasons including changes in measurement and inspection items, sampling inspections, and unexpected process events. When applied to product failure prediction, the incompleteness of data should be properly addressed to avoid performance degradation in prediction models. Well-known approaches for missing data treatment, such as elimination and imputation, would not perform well under usual scenarios in production data, including high missing rate, systematic missing and class imbalance. To address these limitations, here we present a method for predictive modelling with missing data by considering the characteristics of production data. It builds multiple prediction models on different complete data subsets derived from the original data-set, each of which has different coverage of instances and input variables. These models are selectively used to make predictions for new instances with missing values. We demonstrate the effectiveness of the proposed method through a case study using actual data-sets from a home appliance manufacturer.
KW - data mining
KW - failure prediction
KW - missing value
KW - predictive modelling
KW - production data
UR - http://www.scopus.com/inward/record.url?scp=85035764978&partnerID=8YFLogxK
U2 - 10.1080/00207543.2017.1407883
DO - 10.1080/00207543.2017.1407883
M3 - Article
AN - SCOPUS:85035764978
SN - 0020-7543
VL - 56
SP - 4849
EP - 4859
JO - International Journal of Production Research
JF - International Journal of Production Research
IS - 14
ER -