Abstract
Urbanization and industrialization make it difficult to promptly identify and manage air pollution sources. Machine learning offers a promising way to address this problem. By analyzing multidimensional datasets containing a wide range of air pollutants, a machine learning approach has the potential to significantly improve air pollution management and facilitate source tracking. This study comprehensively evaluates machine learning-based emission source classification models to provide insights into air pollution source tracking and management. Using 972 datasets covering five emission sources and 27 air pollutants, five classification models were implemented and compared: Random Forest (RF), Naïve Bayes Classifier (NBC), Support Vector Machine (SVM), Artificial Neural Network (ANN), and K-Nearest Neighbors (K-NN). The RF model outperformed the other four models, achieving an accuracy of 0.9691 and a kappa value of 0.9537. Hydrogen chloride and acetaldehyde were the most important variables for classifying emission sources. The findings suggest that machine learning techniques can help address air pollution challenges, and the classifier implemented in this study shows great promise for effective emission source identification.
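
The two metrics reported above, accuracy and the kappa value, can be illustrated with a minimal sketch. It assumes the kappa value is Cohen's kappa, computed as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the accuracy) and p_e is the agreement expected by chance. The function name `accuracy_and_kappa` and the source labels "A", "B", and "C" are hypothetical placeholders, not the study's code or data.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)

    # Observed agreement p_o: share of samples where the prediction is correct.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement p_e: for each class, (share of true labels) times
    # (share of predicted labels), summed over all classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if p_e == 1.0:
        # Kappa is undefined when chance agreement is already perfect
        # (every true and predicted label belongs to a single class).
        return p_o, float("nan")
    return p_o, (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy labels for three emission sources (not data from the study).
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "C", "C", "C"]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy={acc:.4f}, kappa={kappa:.4f}")
```

Because kappa discounts agreement expected by chance, it is lower than accuracy whenever the classifier makes any errors, which is consistent with the reported kappa (0.9537) sitting slightly below the reported accuracy (0.9691).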
Original language | English |
---|---|
Article number | 230222 |
Journal | Aerosol and Air Quality Research |
Volume | 24 |
Issue number | 7 |
State | Published - Jul 2024 |
Keywords
- Air pollutants
- Classification
- Emission sources
- Machine learning