MSA-GCN: Exploiting Multi-Scale Temporal Dynamics With Adaptive Graph Convolution for Skeleton-Based Action Recognition

Kowovi Comivi Alowonou, Ji Hyeong Han

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Graph convolutional networks (GCNs) have been widely used and have achieved remarkable results in skeleton-based action recognition. We note that existing GCN-based approaches rely on local context information of the skeleton joints to construct adaptive graphs for feature aggregation, which limits their ability to understand actions that involve coordinated movements across various parts of the body. An adaptive graph built upon the global context information of the joints can help move beyond this limitation. Therefore, in this paper, we propose a novel approach to skeleton-based action recognition named Multi-stage Adaptive Graph Convolution Network (MSA-GCN). It consists of two modules: Multi-stage Adaptive Graph Convolution (MSA-GC) and Temporal Multi-Scale Transformer (TMST). These two modules work together to capture complex spatial and temporal patterns within skeleton data effectively. Specifically, MSA-GC explores both local and global context information of the joints across all sequences to construct the adaptive graph, which facilitates the understanding of complex and nuanced relationships between joints. The TMST module, in turn, integrates a Gated Multi-stage Temporal Convolution (GMSTC) with a Temporal Multi-Head Self-Attention (TMHSA) to capture global temporal features and accommodate both long-term and short-term dependencies within action sequences. In extensive experiments on multiple benchmark datasets, including NTU RGB+D 60, NTU RGB+D 120, and Northwestern-UCLA, MSA-GCN achieves state-of-the-art performance, demonstrating its effectiveness in skeleton-based action recognition.
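
The idea of building an adaptive graph from both local and global joint context can be illustrated with a small sketch. This is not the authors' MSA-GC module: the function names, the dot-product affinity, the mixing weights `alpha` and `beta`, and the absence of learned parameters are all assumptions made purely for illustration. Here "local" context is taken to mean per-frame joint similarity and "global" context the similarity of joint features averaged over the whole sequence.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def pairwise_affinity(feats):
    """Row-normalised dot-product affinity between joints (feats is V x C)."""
    return [softmax([dot(fv, fu) for fu in feats]) for fv in feats]


def adaptive_graph_conv(x, static_adj, alpha=0.5, beta=0.5):
    """Hypothetical adaptive aggregation over a skeleton sequence.

    x:          T x V x C nested lists (frames, joints, channels)
    static_adj: V x V fixed skeleton adjacency
    alpha/beta: assumed mixing weights for local and global affinity
    """
    T, V, C = len(x), len(x[0]), len(x[0][0])

    # Global context: each joint's features averaged over all frames.
    global_feats = [
        [sum(x[t][v][c] for t in range(T)) / T for c in range(C)]
        for v in range(V)
    ]
    global_aff = pairwise_affinity(global_feats)

    out = []
    for t in range(T):
        # Local context: joint affinity computed from this frame only.
        local_aff = pairwise_affinity(x[t])
        adj = [
            [static_adj[v][u] + alpha * local_aff[v][u] + beta * global_aff[v][u]
             for u in range(V)]
            for v in range(V)
        ]
        # Aggregate neighbour features with the combined adjacency.
        out.append([
            [sum(adj[v][u] * x[t][u][c] for u in range(V)) for c in range(C)]
            for v in range(V)
        ])
    return out


# Toy input: 2 frames, 3 joints, 2 channels, identity skeleton graph.
x = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]],
]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
y = adaptive_graph_conv(x, identity)
print(len(y), len(y[0]), len(y[0][0]))
```

Setting `alpha` and `beta` to zero reduces the sketch to aggregation over the fixed skeleton graph alone, which makes the contribution of the two adaptive terms easy to isolate.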

Original language: English
Pages (from-to): 193552-193563
Number of pages: 12
Journal: IEEE Access
Volume: 12
State: Published - 2024

Keywords

  • GCN
  • Skeleton-based action recognition
  • dynamic graph topology
  • multi-scale temporal processing
