TY - JOUR
T1 - Fusing Higher-Order Features in Graph Neural Networks for Skeleton-Based Action Recognition
AU - Qin, Zhenyue
AU - Liu, Yang
AU - Ji, Pan
AU - Kim, Dongwoo
AU - Wang, Lei
AU - McKay, R. I.
AU - Anwar, Saeed
AU - Gedeon, Tom
N1 - Publisher Copyright: © 2012 IEEE.
PY - 2024/4/1
Y1 - 2024/4/1
AB - Skeleton sequences are lightweight and compact and thus are ideal candidates for action recognition on edge devices. Recent skeleton-based action recognition methods extract features from 3-D joint coordinates as spatial-temporal cues, using these representations in a graph neural network for feature fusion to boost recognition performance. The use of first- and second-order features, that is, joint and bone representations, has led to high accuracy. Nonetheless, many models are still confused by actions that have similar motion trajectories. To address this issue, we propose fusing higher-order features in the form of angular encoding (AGE) into modern architectures to robustly capture the relationships between joints and body parts. This simple fusion with popular spatial-temporal graph neural networks achieves new state-of-the-art accuracy on two large benchmarks, NTU60 and NTU120, while employing fewer parameters and reduced run time. Our source code is publicly available at: https://github.com/ZhenyueQin/Angular-Skeleton-Encoding.
KW - Feature extraction
KW - graph neural network
KW - skeleton-based action recognition
UR - http://www.scopus.com/inward/record.url?scp=85139396548&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2022.3201518
DO - 10.1109/TNNLS.2022.3201518
M3 - Article
SN - 2162-237X
VL - 35
SP - 4783
EP - 4797
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 4
ER -