TY - JOUR
T1 - HEaaN-STAT
T2 - A Privacy-Preserving Statistical Analysis Toolkit for Large-Scale Numerical, Ordinal, and Categorical Data
AU - Lee, Younho
AU - Seo, Jinyeong
AU - Nam, Yujin
AU - Chae, Jiseok
AU - Cheon, Jung Hee
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2024/5/1
Y1 - 2024/5/1
N2 - Statistical analysis of large-scale data is useful as it enables the extraction of a large amount of information, despite its simplicity. Therefore, fusing and analyzing data from different security domains is an attractive and promising approach, unless it jeopardizes the privacy of the data in any security domain. In this study, we proposed the HEaaN-STAT toolkit that can efficiently fuse data from different domains to enable large-scale statistical analysis while protecting data privacy. Moreover, we proposed an efficient inverse operation and a table lookup function for Cheon-Kim-Kim-Song (CKKS) encrypted data, as well as a data encoding method for counting encrypted data. Based on this, we proposed a method for generating a contingency table with a large number of cases and k-percentile for large-scale data that is hundreds to thousands of times faster than the method proposed by Lu et al. in NDSS'17. The validity of the proposed toolkit was verified through practical use for business applications using real-world data.
AB - Statistical analysis of large-scale data is useful as it enables the extraction of a large amount of information, despite its simplicity. Therefore, fusing and analyzing data from different security domains is an attractive and promising approach, unless it jeopardizes the privacy of the data in any security domain. In this study, we proposed the HEaaN-STAT toolkit that can efficiently fuse data from different domains to enable large-scale statistical analysis while protecting data privacy. Moreover, we proposed an efficient inverse operation and a table lookup function for Cheon-Kim-Kim-Song (CKKS) encrypted data, as well as a data encoding method for counting encrypted data. Based on this, we proposed a method for generating a contingency table with a large number of cases and k-percentile for large-scale data that is hundreds to thousands of times faster than the method proposed by Lu et al. in NDSS'17. The validity of the proposed toolkit was verified through practical use for business applications using real-world data.
KW - applied cryptography
KW - Homomorphic encryption
KW - information security
KW - privacy-preserving statistical data analysis
UR - https://www.scopus.com/pages/publications/85162920679
U2 - 10.1109/TDSC.2023.3275649
DO - 10.1109/TDSC.2023.3275649
M3 - Article
AN - SCOPUS:85162920679
SN - 1545-5971
VL - 21
SP - 1224
EP - 1241
JO - IEEE Transactions on Dependable and Secure Computing
JF - IEEE Transactions on Dependable and Secure Computing
IS - 3
ER -