TY - JOUR
T1 - MLKD-Net
T2 - Lightweight Single Image Dehazing via Multi-Head Large Kernel Attention
AU - Moon, Jiwon
AU - Park, Jongyoul
N1 - Publisher Copyright:
© 2025 by the authors.
PY - 2025/6
Y1 - 2025/6
N2 - Haze significantly degrades image quality by reducing contrast and blurring object boundaries, which impairs the performance of computer vision systems. Among various approaches, single-image dehazing remains particularly challenging due to the absence of depth information. While Vision Transformer (ViT)-based models have achieved remarkable results by leveraging multi-head attention and large effective receptive fields, their high computational complexity limits their applicability in real-time and embedded systems. To address this limitation, we propose MLKD-Net, a lightweight CNN-based model that incorporates a novel Multi-Head Large Kernel Block (MLKD), which is based on the Multi-Head Large Kernel Attention (MLKA) mechanism. This structure preserves the benefits of large receptive fields and a multi-head design while also ensuring compactness and computational efficiency. MLKD-Net achieves a PSNR of 37.42 dB on the SOTS-Outdoor dataset while using 90.9% fewer parameters than leading Transformer-based models. Furthermore, it demonstrates real-time performance with 55.24 ms per image (18.2 FPS) on the NVIDIA Jetson Orin Nano in TensorRT-INT8 mode. These results highlight its effectiveness and practicality for resource-constrained, real-time image dehazing applications.
AB - Haze significantly degrades image quality by reducing contrast and blurring object boundaries, which impairs the performance of computer vision systems. Among various approaches, single-image dehazing remains particularly challenging due to the absence of depth information. While Vision Transformer (ViT)-based models have achieved remarkable results by leveraging multi-head attention and large effective receptive fields, their high computational complexity limits their applicability in real-time and embedded systems. To address this limitation, we propose MLKD-Net, a lightweight CNN-based model that incorporates a novel Multi-Head Large Kernel Block (MLKD), which is based on the Multi-Head Large Kernel Attention (MLKA) mechanism. This structure preserves the benefits of large receptive fields and a multi-head design while also ensuring compactness and computational efficiency. MLKD-Net achieves a PSNR of 37.42 dB on the SOTS-Outdoor dataset while using 90.9% fewer parameters than leading Transformer-based models. Furthermore, it demonstrates real-time performance with 55.24 ms per image (18.2 FPS) on the NVIDIA Jetson Orin Nano in TensorRT-INT8 mode. These results highlight its effectiveness and practicality for resource-constrained, real-time image dehazing applications.
KW - convolutional neural networks
KW - multi-head large kernel attention
KW - real-time inference
KW - single-image dehazing
UR - https://www.scopus.com/pages/publications/105007679172
U2 - 10.3390/app15115858
DO - 10.3390/app15115858
M3 - Article
AN - SCOPUS:105007679172
SN - 2076-3417
VL - 15
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 11
M1 - 5858
ER -