TY - JOUR
T1 - Augmented access pattern-based I/O performance prediction using directed acyclic graph regression
AU - Kumar, Manish
AU - Kim, Sunggon
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
PY - 2025/2
Y1 - 2025/2
AB - The rise of big data processing creates a new challenge: keeping pace with the flow of information, for which efficient I/O performance is key. However, analyzing I/O performance is a complex task due to the many layers involved, from applications and libraries to the operating system and storage devices. In this paper, we propose a convolutional neural network (CNN)-based directed acyclic graph regression (DAGR) network to predict the I/O performance of applications. The system first gathers I/O request information directly from the storage layer (block storage). This information is then converted into a visual representation (graph image) and augmented using various techniques to create additional training data. The core of the system is a CNN-based prediction model designed to identify potential I/O performance patterns by analyzing the generated graph images. Evaluations using real-world application benchmarks demonstrate that the proposed method can accurately predict the performance of various applications, including file servers, databases, mail servers, and video servers, with an accuracy of up to 99.73%.
KW - Deep learning
KW - Directed acyclic graph
KW - High performance computing
KW - Image augmentation
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85206356557&partnerID=8YFLogxK
U2 - 10.1007/s10586-024-04719-6
DO - 10.1007/s10586-024-04719-6
M3 - Article
AN - SCOPUS:85206356557
SN - 1386-7857
VL - 28
JO - Cluster Computing
JF - Cluster Computing
IS - 1
M1 - 4
ER -