Article Abstract
Two-stream Shift Graph Convolutional Network: Thermal Infrared Action Recognition
Submitted: 2024-01-03  Revised: 2024-03-04
DOI:
Keywords: Action recognition  Thermal infrared  Spatial-temporal graph convolutional networks  Two-stream structure
Funding: National Natural Science Foundation of China (No. 61973109)
Author  Affiliation  Postal code
Wang Junnian*  Hunan University of Science and Technology  411201
Abstract:
      Action recognition research currently relies mainly on visible-light video. Exploiting the insensitivity of thermal infrared imagery to changes in lighting, this paper proposes a lightweight algorithm that recognizes actions in thermal infrared videos from human skeletons. The contributions are as follows. First, the algorithm uses YOLOv7-tiny for target detection, AlphaPose for pose estimation and dynamic skeleton modeling, and a spatial-temporal graph convolutional network to extract spatiotemporal features. Second, rather than treating the whole video as a single unit, it extracts each person's actions over different time intervals to improve robustness. Third, a two-stream shift graph convolutional network (2s-ShiftGCN) is introduced to improve recognition accuracy. In experiments, 2s-ShiftGCN achieves a Top-1 accuracy of 88.06% and a Top-5 accuracy of 98.28% on the InfAR-skeleton dataset, and a Top-1 accuracy of 55.26% and a Top-5 accuracy of 83.98% on the filtered Kinetics-skeleton dataset.
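To make the reported Top-1 and Top-5 figures concrete, the sketch below shows how top-k accuracy is commonly computed from per-class scores, and how two streams might be combined before ranking. The abstract does not say how 2s-ShiftGCN fuses its streams, so the score-level summation here is an assumption, and the names `fuse_scores`, `top_k_accuracy`, `stream_a`, `stream_b`, and `weight_b` are hypothetical. The scores are toy values, not data from the paper.

```python
from typing import List, Sequence


def fuse_scores(stream_a: Sequence[Sequence[float]],
                stream_b: Sequence[Sequence[float]],
                weight_b: float = 1.0) -> List[List[float]]:
    """Late fusion (assumed): add the per-class scores of two streams."""
    return [[a + weight_b * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(stream_a, stream_b)]


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 4 samples, 3 classes (illustrative values only).
stream_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.4, 0.35, 0.25], [0.1, 0.2, 0.7]]
stream_b = [[0.5, 0.4, 0.1], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]]
labels = [0, 2, 1, 0]

fused = fuse_scores(stream_a, stream_b)
top1_single = top_k_accuracy(stream_a, labels, k=1)  # stream A alone
top1_fused = top_k_accuracy(fused, labels, k=1)      # after fusion
top2_fused = top_k_accuracy(fused, labels, k=2)
```

On this toy data, stream A alone classifies one of four samples correctly at Top-1 (0.25), while the fused scores reach 0.5 at Top-1 and 0.75 at Top-2. Top-5 is computed the same way with `k=5`, which is why it is always at least as high as Top-1.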