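Two-Stream Shift Spatial-Temporal Graph Convolution: Action Recognition in Thermal Imaging
Junnian Wang
Hunan University of Science and Technology
Abstract:
Action recognition research currently relies mainly on visible-light imagery. Exploiting the insensitivity of thermal infrared images to changes in lighting, this paper proposes a lightweight algorithm that recognizes actions in thermal infrared videos from human skeletons. The contributions are as follows. The algorithm uses YOLOv7-tiny for object detection and Alphapose for pose estimation and dynamic skeleton modeling, then extracts spatial-temporal features with a spatial-temporal graph convolutional network. Rather than treating the whole video as a single unit, it extracts each person's actions over different time intervals to improve robustness. To improve recognition accuracy, it introduces a two-stream shift graph convolutional network (2s-ShiftGCN). In experiments, 2s-ShiftGCN reaches a Top-1 accuracy of 88.06% and a Top-5 accuracy of 98.28% on the InfAR-skeleton dataset, and a Top-1 accuracy of 55.26% and a Top-5 accuracy of 83.98% on the filtered kinetics-skeleton dataset.
Key words:  Action recognition  Thermal infrared  Spatial-temporal graph convolutional network  Two-stream structure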
DOI:
Classification number: TP183
Funding: National Natural Science Foundation of China (No. 61973109)
Two-stream Shift Graph Convolutional Network: Thermal Infrared Action Recognition
Junnian Wang
Hunan University of Science and Technology
Abstract:
Action recognition research currently relies predominantly on visible-light imagery. Leveraging the insensitivity of thermal infrared imagery to changes in lighting conditions, this study proposes a lightweight, skeleton-based algorithm for action recognition in thermal infrared videos. The contributions are as follows. The algorithm uses YOLOv7-tiny for object detection and Alphapose for pose estimation and dynamic skeleton modeling, then extracts spatial-temporal features with a spatial-temporal graph convolutional network. Rather than treating the entire video as a single entity, it extracts each individual's actions over different time intervals to improve robustness. To improve recognition accuracy, a two-stream shift graph convolutional network (2s-ShiftGCN) is introduced. In experiments, 2s-ShiftGCN achieves a Top-1 accuracy of 88.06% and a Top-5 accuracy of 98.28% on the InfAR-skeleton dataset. On the filtered kinetics-skeleton dataset, the Top-1 accuracy is 55.26% and the Top-5 accuracy is 83.98%.
Key words:  Action recognition  Thermal infrared  Spatial-temporal graph convolutional networks  Two-stream structure
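As an illustration of the Top-1 and Top-5 figures reported in the abstract, the following minimal sketch shows how Top-k accuracy is usually computed from per-class scores. It also shows one common way of combining two streams, which is to sum their class scores. The abstract does not state what the two streams are or how 2s-ShiftGCN fuses them, so `fuse_scores`, its `weight_b` parameter, and the score-summing rule are assumptions made for illustration only, not the authors' implementation.

```python
from typing import List, Sequence


def fuse_scores(
    stream_a: Sequence[Sequence[float]],
    stream_b: Sequence[Sequence[float]],
    weight_b: float = 1.0,
) -> List[List[float]]:
    """Weighted element-wise sum of per-class scores from two streams.

    Assumption: score-level fusion by summation. The source abstract
    does not specify the fusion rule used by 2s-ShiftGCN.
    """
    return [
        [a + weight_b * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(stream_a, stream_b)
    ]


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)
```

With `k=1` a prediction counts only if the highest-scoring class is the true label. With `k=5` it counts if the true label is anywhere among the five highest-scoring classes, which is why a Top-5 figure is never lower than the Top-1 figure on the same data.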