TY - JOUR
T1 - New knowledge extraction technique using probability for case-based reasoning
T2 - Application to medical diagnosis
AU - Park, Yoon Joo
AU - Kim, Byung Chun
AU - Chun, Se Hak
PY - 2006/2
Y1 - 2006/2
N2 - Case-based reasoning (CBR) has been used in various problem-solving areas such as financial forecasting, credit analysis and medical diagnosis. However, conventional CBR has the limitation that it has no criterion for choosing the nearest cases based on the probabilistic similarity of cases. It uses a fixed number of neighbors without considering an optimal number for each target case, so it does not guarantee optimal similar neighbors for various target cases. This lowers predictability, because the selected neighbors can deviate from the desired similar neighbors. In this paper we suggest a new case extraction technique called statistical case-based reasoning. The main idea involves a dynamic adaptation of the optimal number of neighbors by considering the distribution of distances between potential similar neighbors for each target case. To do this, our technique finds the optimal distance threshold and selects similar neighbors satisfying the distance threshold criterion. We apply this new method to five real-life medical data sets and compare the results with those of a statistical method, logistic regression, and of the learning methods C5.0, CART, neural networks and conventional CBR. The results show that the proposed technique outperforms many other methods, overcomes the limitation of conventional CBR, and provides improved classification accuracy.
AB - Case-based reasoning (CBR) has been used in various problem-solving areas such as financial forecasting, credit analysis and medical diagnosis. However, conventional CBR has the limitation that it has no criterion for choosing the nearest cases based on the probabilistic similarity of cases. It uses a fixed number of neighbors without considering an optimal number for each target case, so it does not guarantee optimal similar neighbors for various target cases. This lowers predictability, because the selected neighbors can deviate from the desired similar neighbors. In this paper we suggest a new case extraction technique called statistical case-based reasoning. The main idea involves a dynamic adaptation of the optimal number of neighbors by considering the distribution of distances between potential similar neighbors for each target case. To do this, our technique finds the optimal distance threshold and selects similar neighbors satisfying the distance threshold criterion. We apply this new method to five real-life medical data sets and compare the results with those of a statistical method, logistic regression, and of the learning methods C5.0, CART, neural networks and conventional CBR. The results show that the proposed technique outperforms many other methods, overcomes the limitation of conventional CBR, and provides improved classification accuracy.
KW - Artificial intelligence
KW - Case-based reasoning
KW - Data mining
KW - Discriminant analysis
KW - Learning methods
KW - Logistic regression
KW - Neural network
KW - Optimal similar neighbors
KW - Probability
UR - http://www.scopus.com/inward/record.url?scp=33645031017&partnerID=8YFLogxK
U2 - 10.1111/j.1468-0394.2006.00321.x
DO - 10.1111/j.1468-0394.2006.00321.x
M3 - Article
AN - SCOPUS:33645031017
SN - 0266-4720
VL - 23
SP - 2
EP - 20
JO - Expert Systems
JF - Expert Systems
IS - 1
ER -