TY - JOUR
T1 - Fuzzy anomaly scores for Isolation Forest
AU - Kim, Kyoungok
N1 - Publisher Copyright: © 2024
PY - 2024/11
Y1 - 2024/11
N2 - The detection of anomalies in data presents a significant challenge in various applications. The Isolation Forest (IF) has gained attention due to its notable performance features, including high accuracy, efficiency, simplicity, and rapid computation. However, prior research has primarily concentrated on constructing isolation trees (iTrees), overlooking the considerable influence of anomaly scoring methods on anomaly detection performance. This study introduces an innovative anomaly scoring method that integrates fuzzy concepts to enhance detection performance. Fuzzy concepts adeptly manage ambiguity and uncertainty, making them more readily applicable in anomaly scoring than in iTree training. Unlike conventional methods that assign a sample to a single child node in the decision path, the proposed fuzzy anomaly scoring method allows samples to be assigned to all child nodes with varying membership degrees based on the target sample. Consequently, this method aggregates the path lengths of all external nodes in a weighted manner, minimizing the impact of irrelevant splits on anomaly scores. Considering that even IF algorithms using informative splits select splits based on data distribution rather than label information, introducing fuzziness to the splits themselves can effectively mitigate performance degradation caused by irrelevant splits. Extensive experiments on 25 benchmark datasets demonstrated that the proposed anomaly scoring method significantly improved both the performance and stability of anomaly detection with the base IF algorithm, outperforming other IF algorithms and fuzzy rough set-based anomaly detection methods.
AB - The detection of anomalies in data presents a significant challenge in various applications. The Isolation Forest (IF) has gained attention due to its notable performance features, including high accuracy, efficiency, simplicity, and rapid computation. However, prior research has primarily concentrated on constructing isolation trees (iTrees), overlooking the considerable influence of anomaly scoring methods on anomaly detection performance. This study introduces an innovative anomaly scoring method that integrates fuzzy concepts to enhance detection performance. Fuzzy concepts adeptly manage ambiguity and uncertainty, making them more readily applicable in anomaly scoring than in iTree training. Unlike conventional methods that assign a sample to a single child node in the decision path, the proposed fuzzy anomaly scoring method allows samples to be assigned to all child nodes with varying membership degrees based on the target sample. Consequently, this method aggregates the path lengths of all external nodes in a weighted manner, minimizing the impact of irrelevant splits on anomaly scores. Considering that even IF algorithms using informative splits select splits based on data distribution rather than label information, introducing fuzziness to the splits themselves can effectively mitigate performance degradation caused by irrelevant splits. Extensive experiments on 25 benchmark datasets demonstrated that the proposed anomaly scoring method significantly improved both the performance and stability of anomaly detection with the base IF algorithm, outperforming other IF algorithms and fuzzy rough set-based anomaly detection methods.
KW - Anomaly detection
KW - Anomaly score
KW - Fuzzy logic
KW - Isolation forest
KW - Membership degree
UR - http://www.scopus.com/inward/record.url?scp=85203412661&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2024.112193
DO - 10.1016/j.asoc.2024.112193
M3 - Article
AN - SCOPUS:85203412661
SN - 1568-4946
VL - 166
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 112193
ER -