Towards a Rigorous Evaluation of Time-Series Anomaly Detection

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

170 Scopus citations

Abstract

In recent years, proposed studies on time-series anomaly detection (TAD) report high F1 scores on benchmark TAD datasets, giving the impression of clear improvements in TAD. However, most studies apply a peculiar evaluation protocol called point adjustment (PA) before scoring. In this paper, we theoretically and experimentally reveal that the PA protocol has a great possibility of overestimating the detection performance; even a random anomaly score can easily turn into a state-of-the-art TAD method. Therefore, the comparison of TAD methods after applying the PA protocol can lead to misguided rankings. Furthermore, we question the potential of existing TAD methods by showing that an untrained model obtains comparable detection performance to the existing methods even when PA is forbidden. Based on our findings, we propose a new baseline and an evaluation protocol. We expect that our study will help a rigorous evaluation of TAD and lead to further improvement in future researches.

Original languageEnglish
Title of host publicationAAAI-22 Technical Tracks 7
PublisherAssociation for the Advancement of Artificial Intelligence
Pages7194-7201
Number of pages8
ISBN (Electronic)1577358767, 9781577358763
DOIs
StatePublished - 30 Jun 2022
Event36th AAAI Conference on Artificial Intelligence, AAAI 2022 - Virtual, Online
Duration: 22 Feb 20221 Mar 2022

Publication series

NameProceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Volume36

Conference

Conference36th AAAI Conference on Artificial Intelligence, AAAI 2022
CityVirtual, Online
Period22/02/221/03/22

Fingerprint

Dive into the research topics of 'Towards a Rigorous Evaluation of Time-Series Anomaly Detection'. Together they form a unique fingerprint.

Cite this