Abstract
Haze significantly degrades image quality by reducing contrast and blurring object boundaries, which impairs the performance of computer vision systems. Among various approaches, single-image dehazing remains particularly challenging due to the absence of depth information. While Vision Transformer (ViT)-based models have achieved remarkable results by leveraging multi-head attention and large effective receptive fields, their high computational complexity limits their applicability in real-time and embedded systems. To address this limitation, we propose MLKD-Net, a lightweight CNN-based model that incorporates a novel Multi-Head Large Kernel Block (MLKD), which is based on the Multi-Head Large Kernel Attention (MLKA) mechanism. This structure preserves the benefits of large receptive fields and a multi-head design while also ensuring compactness and computational efficiency. MLKD-Net achieves a PSNR of 37.42 dB on the SOTS-Outdoor dataset while using 90.9% fewer parameters than leading Transformer-based models. Furthermore, it demonstrates real-time performance with 55.24 ms per image (18.2 FPS) on the NVIDIA Jetson Orin Nano in TensorRT-INT8 mode. These results highlight its effectiveness and practicality for resource-constrained, real-time image dehazing applications.
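The abstract does not describe the internals of the block, so the following is only a rough illustration of why large-kernel designs built from depthwise convolutions can stay compact. It counts weights for a dense large-kernel convolution and for a decomposed large-kernel attention branch (depthwise convolution, dilated depthwise convolution, then 1×1 convolution) split across channel groups, or "heads". The channel width of 64, the kernel/dilation pairs, the decomposition rule, and all function names are assumptions taken from the general large-kernel-attention literature, not values or code from the paper.

```python
import math


def dense_conv_params(channels: int, kernel: int) -> int:
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return channels * channels * kernel * kernel


def lka_params(channels: int, kernel: int, dilation: int) -> int:
    """Weights in one decomposed large-kernel attention branch (bias ignored).

    Assumed decomposition of a K x K kernel with dilation d:
      depthwise (2d-1) x (2d-1) conv
      -> depthwise dilated ceil(K/d) x ceil(K/d) conv
      -> pointwise 1 x 1 conv
    """
    local_k = 2 * dilation - 1
    dilated_k = math.ceil(kernel / dilation)
    depthwise_local = channels * local_k * local_k
    depthwise_dilated = channels * dilated_k * dilated_k
    pointwise = channels * channels
    return depthwise_local + depthwise_dilated + pointwise


def multi_head_lka_params(channels: int, heads: list[tuple[int, int]]) -> int:
    """Total weights when channels are split evenly across heads.

    Each head is a (kernel, dilation) pair with its own decomposed branch.
    """
    if channels % len(heads) != 0:
        raise ValueError("channels must be divisible by the number of heads")
    per_head = channels // len(heads)
    return sum(lka_params(per_head, k, d) for k, d in heads)


if __name__ == "__main__":
    channels = 64                      # hypothetical feature width
    heads = [(21, 3), (35, 4)]         # hypothetical (kernel, dilation) pairs
    dense = dense_conv_params(channels, 35)
    multi_head = multi_head_lka_params(channels, heads)
    print(f"dense 35x35 conv:     {dense:,} weights")
    print(f"two-head decomposed:  {multi_head:,} weights")
```

Under these assumed settings, the two-head decomposed block needs 8,576 weights against 5,017,600 for a single dense 35×35 convolution over the same 64 channels. The comparison covers one layer only and says nothing about the 90.9% whole-model reduction reported above.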
| Original language | English |
|---|---|
| Article number | 5858 |
| Journal | Applied Sciences (Switzerland) |
| Volume | 15 |
| Issue number | 11 |
| State | Published - Jun 2025 |
Keywords
- convolutional neural networks
- multi-head large kernel attention
- real-time inference
- single-image dehazing
Article title: MLKD-Net: Lightweight Single Image Dehazing via Multi-Head Large Kernel Attention