TY - GEN
T1 - Towards Access Pattern Prediction for Big Data Applications
AU - Kim, Changjong
AU - Son, Yongseok
AU - Kim, Sunggon
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Data is growing in importance as modern applications produce ever larger volumes of it; applications now routinely produce and process gigabytes or even terabytes of data. To improve the performance of data-intensive applications, the underlying storage systems exploit application I/O characteristics such as access patterns. For example, existing storage schemes store frequently accessed data on high-performance storage devices such as NVMe SSDs for low latency, and store rarely accessed data on low-performance but high-capacity devices such as tape storage for cost efficiency. Understanding the I/O characteristics of applications is therefore important. In this paper, we propose an access pattern prediction scheme that captures the I/O characteristics of applications and uses them for fast I/O processing. Our scheme uses the application's history and a machine learning algorithm to predict the access pattern accurately. We first use a system log to collect access pattern data from applications. Then, using the logs, we build a machine-learning-based prediction model with the long short-term memory (LSTM) algorithm. Finally, when the application is executed repeatedly, we use the prediction model to predict the application's I/O requests, which can be used to improve storage performance. Evaluation results using a real big data application show that the proposed scheme can accurately predict the access pattern.
UR - https://www.scopus.com/pages/publications/85143253493
U2 - 10.1109/ICTC55196.2022.9952775
DO - 10.1109/ICTC55196.2022.9952775
M3 - Conference contribution
AN - SCOPUS:85143253493
T3 - International Conference on ICT Convergence
SP - 1577
EP - 1580
BT - ICTC 2022 - 13th International Conference on Information and Communication Technology Convergence
PB - IEEE Computer Society
T2 - 13th International Conference on Information and Communication Technology Convergence, ICTC 2022
Y2 - 19 October 2022 through 21 October 2022
ER -