TY - GEN
T1 - MSTA3D
T2 - 32nd ACM International Conference on Multimedia, MM 2024
AU - Tran, Duc Dang Trung
AU - Kang, Byeongkeun
AU - Lee, Yeejin
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/10/28
Y1 - 2024/10/28
AB - Recently, transformer-based techniques incorporating superpoints have become prevalent in 3D instance segmentation. However, they often encounter an over-segmentation problem, especially noticeable with large objects. Additionally, unreliable mask predictions stemming from superpoint mask prediction further compound this issue. To address these challenges, we propose a novel framework called MSTA3D. It leverages multi-scale feature representation and introduces twin-attention mechanisms to effectively capture them. Furthermore, MSTA3D integrates a box query with a box regularizer, offering a complementary spatial constraint alongside semantic queries. Experimental evaluations on ScanNetV2, ScanNet200, and S3DIS datasets demonstrate that our approach surpasses state-of-the-art 3D instance segmentation methods.
KW - 3D point cloud instance segmentation
KW - instance segmentation
KW - multi-scale feature representation
KW - vision transformer
UR - https://www.scopus.com/pages/publications/85209813114
U2 - 10.1145/3664647.3680667
DO - 10.1145/3664647.3680667
M3 - Conference contribution
AN - SCOPUS:85209813114
T3 - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
SP - 1467
EP - 1475
BT - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
Y2 - 28 October 2024 through 1 November 2024
ER -