TY - JOUR
KW - Intangible cultural heritages
KW - Labanotations
KW - 3D human pose estimation
KW - Estimation methods
KW - Feature interaction
KW - Generation method
KW - Generation of extended labanotation
KW - Human pose
KW - Level-of-detail
KW - Spatial transformer
KW - Template matching
AU - Cai, X. Q.
AU - Lu, R.
AU - Cheng, P. Y.
AU - Yao, J. L.
AU - Hu, Y.
AB - To address the low accuracy of existing 3D human pose estimation (HPE) methods and the limited level of detail in Labanotation, we propose an extended Labanotation generation method for intangible cultural heritage dance videos based on 3D HPE. First, a 2D human pose sequence of the performer is input along with spatial location embeddings, and multiple spatial transformer modules are employed to extract spatial features of the human joints and generate cross-joint multiple hypotheses. Temporal features are then extracted by a self-attention module, and the correlations between the different hypotheses are learned using bilinear pooling. Finally, the 3D joint coordinates of the performer are predicted and matched to the corresponding extended Labanotation symbols using the Laban template matching method to generate the extended Labanotation. Experimental results show that, compared with the VideoPose and CrossFormer algorithms, the Mean Per Joint Position Error (MPJPE) of the proposed method is reduced by 3.7 mm and 0.6 mm, respectively, on the Human3.6M dataset, and that the generated extended Labanotation describes movement details better than the basic Labanotation.
AN - WOS:001057802000002
DA - 2023/08/29/
DO - 10.1142/S0218001423550121
PY - 2023
SN - 0218-0014
T2 - International Journal of Pattern Recognition and Artificial Intelligence
TI - An Extended Labanotation Generation Method Based on 3D Human Pose Estimation for Intangible Cultural Heritage Dance Videos
ER -