적시적인 데이터 서비스 제공을 위한 유전자 알고리즘 기반 ETL 배치작업 스케줄링 최적화 연구

Translated title of the contribution: A Study on Optimization of ETL Batch Job Scheduling using Genetic Algorithm to Provide Data Service at The Right Time

Research output: Contribution to journal › Article › peer-review

Abstract

The era of big data has arrived with the growth of social networking services (SNS) and the Internet of Things (IoT). Discovering and using the value in large volumes of data requires analysis, and analysis in turn requires ETL (extract, transform, load). An ETL environment runs thousands of jobs on limited system resources, and those jobs are bundled and scheduled in complex ways. Because growth in data volume driven by business changes can delay ETL performance, performance management is required. In this study, we address ETL optimization by applying a genetic algorithm, a type of metaheuristic. The objective function of the algorithm minimizes the execution time of the ETL batch and the load on server CPU resources. The research data are the 3-month average CPU usage and execution time of 260 ETL jobs running in an actual business environment. The algorithm is run repeatedly with different parameter settings to obtain optimal results. This study is expected to serve as an important basis for optimizing ETL operations in big data systems.
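
The abstract does not specify the chromosome encoding, the genetic operators, or how the two objectives are combined, so the following is only a minimal illustrative sketch and not the authors' method. It assumes each job is a hypothetical (duration, CPU usage) pair, a chromosome holds one integer start slot per job, and the cost is a simple weighted sum of batch finish time and peak concurrent CPU usage. The names evaluate, genetic_schedule, horizon, and cpu_weight are all invented for illustration.

```python
import random


def evaluate(starts, jobs, cpu_weight=1.0):
    """Cost of a schedule (lower is better).

    Assumed cost: batch finish time + cpu_weight * peak concurrent CPU usage.
    jobs is a list of (duration_in_slots, cpu_usage) with duration >= 1.
    """
    finish = max(s + d for s, (d, _) in zip(starts, jobs))
    load = [0.0] * finish
    for s, (d, cpu) in zip(starts, jobs):
        for t in range(s, s + d):
            load[t] += cpu
    return finish + cpu_weight * max(load)


def genetic_schedule(jobs, horizon, pop_size=40, generations=200,
                     mutation_rate=0.1, seed=0):
    """Search for start slots in [0, horizon) with a basic genetic algorithm.

    Assumes at least two jobs and pop_size >= 4.
    """
    rng = random.Random(seed)
    n = len(jobs)
    # Initial population: random start slot for every job.
    pop = [[rng.randrange(horizon) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: evaluate(c, jobs))
        elite = pop[: pop_size // 2]  # truncation selection: keep the best half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally move a job to a random start slot.
            child = [rng.randrange(horizon) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda c: evaluate(c, jobs))
    return best, evaluate(best, jobs)


if __name__ == "__main__":
    # Made-up jobs, not data from the study: (duration, CPU usage in percent).
    demo_jobs = [(3, 40.0), (2, 30.0), (4, 50.0), (1, 20.0)]
    schedule, cost = genetic_schedule(demo_jobs, horizon=10)
    print(schedule, cost)
```

Rerunning genetic_schedule with different values of pop_size, generations, mutation_rate, or cpu_weight mirrors, in a loose way, the parameter variation the abstract mentions. A weighted sum mixes time slots and CPU percentages in one number, so the weight would need tuning for any real workload.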
Original language: Korean
Pages (from-to): 71-84
Number of pages: 14
Journal: Entrue Journal of Information Technology
Volume: 16
Issue number: 2
State: Published - Dec 2017
