Article Abstract
Label-feature self-learning for multi-label image classification
Submitted: 2023-12-19  Revised: 2024-02-16
DOI:
Keywords: Classification  Multi-label classification  Feature fusion  Training strategy  Transformer encoder
Funding: National Natural Science Foundation of China (Grant No. 62271151); Natural Science Foundation of Fujian Province (Grant No. 2021J01580).
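Author  Affiliation  Postal code
Wu Chen (吴晨)  School of Advanced Manufacturing, Fuzhou University, Quanzhou, Fujian  362251
Wu Lijun* (吴丽君)  School of Advanced Manufacturing, Fuzhou University, Quanzhou, Fujian  362251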
Abstract:
      Most current research on multi-label image classification focuses either on improving image feature extraction or on training classifiers that capture label correlations, while few studies use label features for classification. This paper proposes a multi-label image classification method based on self-learned label features. First, to guide the self-learning of label features, a feature fusion module is proposed that exploits the semantic correlation between image features and label features. Second, to strengthen image feature extraction, an image feature enhancement module based on an improved Transformer encoder is designed, which adds extra attention operations to better learn the salient features of an image. In addition, existing methods usually either crop an image and then scale it, or scale it directly to a specified size during training, ignoring the relationship between the two strategies; a new training strategy is therefore designed that combines both, further improving model performance. Finally, experiments on the MS COCO 2014 and PASCAL VOC 2007 datasets yield average accuracies of 84.8% and 94.5%, respectively, validating the effectiveness of the model.
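The abstract does not say how the two resizing strategies are combined, so the following is only a minimal sketch under one assumed mixing rule: each training sample is randomly routed to either crop-then-scale or direct scaling. The function names, the probability `p_crop`, the crop range `scale_range`, and the nearest-neighbour resize are all hypothetical choices made for illustration, not details taken from the paper.

```python
import random


def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D image stored as a list of rows."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def random_crop(img, scale_range, rng):
    """Cut a random window whose sides are a fraction of the original sides."""
    in_h, in_w = len(img), len(img[0])
    frac = rng.uniform(*scale_range)          # assumed range, at most 1.0
    crop_h = max(1, int(in_h * frac))
    crop_w = max(1, int(in_w * frac))
    top = rng.randint(0, in_h - crop_h)       # randint is inclusive on both ends
    left = rng.randint(0, in_w - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


def mixed_resize(img, size, p_crop=0.5, scale_range=(0.5, 1.0), rng=random):
    """Assumed mixing rule: pick one of the two strategies per sample."""
    if rng.random() < p_crop:
        cropped = random_crop(img, scale_range, rng)
        return resize_nearest(cropped, size, size), "crop_then_resize"
    return resize_nearest(img, size, size), "direct_resize"


# Usage: a toy 6x8 "image" resized to 4x4 with a seeded generator.
rng = random.Random(0)
image = [[r * 8 + c for c in range(8)] for r in range(6)]
out, strategy = mixed_resize(image, size=4, rng=rng)
```

Direct scaling keeps every object in view but distorts the aspect ratio, whereas cropping preserves local detail at the risk of cutting out an object whose label still applies to the image; mixing the two per sample is one simple way to expose a model to both views.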