TY - GEN
T1 - SP2Mask4D
T2 - 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
AU - Park, Yongseok
AU - Tran, Duc Dang Trung
AU - Kim, Minho
AU - Kim, Hyeonseok
AU - Lee, Yeejin
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
AB - The increasing need for precise segmentation in dynamic outdoor environments, particularly with LiDAR data, has brought attention to the 4D panoptic segmentation task, which requires accurately identifying both object instances and semantic labels across spatial and temporal dimensions. In this work, we present SP2Mask4D, a novel approach that replaces the commonly used point-level transformer architecture with a superpoint-based one. This modification yields faster inference and reduced memory consumption while maintaining performance competitive with transformer-based methods. Although both approaches rely on attention mechanisms, traditional transformer models apply attention across all points, incurring high computational costs; SP2Mask4D instead restricts attention to localized superpoints, significantly lowering the computational burden. Experiments on the SemanticKITTI dataset show that SP2Mask4D reduces inference time by about 32.8% and improves memory efficiency by 60.3% while preserving segmentation performance comparable to state-of-the-art methods.
KW - 4D Panoptic segmentation
KW - Point Clouds
KW - Superpoint
KW - Transformer
UR - http://www.scopus.com/inward/record.url?scp=86000007651&partnerID=8YFLogxK
U2 - 10.1109/ICEIC64972.2025.10879637
DO - 10.1109/ICEIC64972.2025.10879637
M3 - Conference contribution
AN - SCOPUS:86000007651
T3 - 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
BT - 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 January 2025 through 22 January 2025
ER -