Graph convolutional network-based reinforcement learning for feature selection
Affiliation:

Taiyuan University of Technology

Fund Project:

The National Natural Science Foundation of China (General Program, Key Program, Major Program) (62171307); Shanxi Province Science and Technology Cooperation and Exchange Special Project (202304041101035)

    Abstract:

    With the development of information technology, data in many fields often contain a large number of redundant features, which degrades the performance of classification models. Feature selection is a data preprocessing technique that effectively removes redundant features; however, existing feature selection methods cannot select effective features while also maintaining high classification accuracy. To address this problem, a graph convolutional network-based reinforcement learning method for feature selection is proposed. Using a deep Q-network as the basic framework, the feature selection problem is formulated as a Markov decision process. First, a state representation method based on a graph convolutional network is designed, which converts feature subsets into graph structures so that relationships between features are captured efficiently. Then, a reward function that jointly considers feature importance, inter-feature correlation, and classification performance is proposed to guide the agent toward features with high Q-values, so that the resulting feature subset balances all three criteria. Experiments are conducted on 14 public datasets and a stroke screening dataset. The results show that, compared with existing deep Q-network feature selection methods, the proposed method achieves accuracies of 99% and 85% on the Congress dataset and the stroke screening dataset, improvements of 11% and 3.6%, respectively, which verifies the effectiveness and feasibility of the method.
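
The abstract does not specify how the feature graph is built, which node attributes are used, how the graph convolutional network is structured, or the exact form of the reward. The sketch below is therefore only one plausible reading, not the authors' implementation. It assumes that features are connected when their absolute Pearson correlation exceeds a hypothetical `threshold`, that each node is described by four simple column statistics, that the state is a mean-pooled single GCN layer with untrained weights, and that the reward is a weighted sum of mean importance, negative mean pairwise correlation, and classifier accuracy. The function names `abs_correlation`, `gcn_state`, and `reward` are illustrative.

```python
import numpy as np


def abs_correlation(X, selected):
    """Absolute Pearson correlation matrix of the selected feature columns."""
    corr = np.corrcoef(X[:, selected], rowvar=False)
    return np.abs(np.atleast_2d(corr))  # atleast_2d covers the single-feature case


def gcn_state(X, selected, weight, threshold=0.3):
    """One GCN layer over the feature graph, mean-pooled into a state vector."""
    # Assumed graph rule: connect two features when |r| >= threshold.
    adj = (abs_correlation(X, selected) >= threshold).astype(float)
    np.fill_diagonal(adj, 1.0)  # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    sub = X[:, selected]
    # Assumed node descriptors: per-feature mean, std, min, max.
    nodes = np.stack([sub.mean(0), sub.std(0), sub.min(0), sub.max(0)], axis=1)
    hidden = np.maximum(norm_adj @ nodes @ weight, 0.0)  # ReLU(A_hat H W)
    return hidden.mean(axis=0)


def reward(importance, X, selected, accuracy, weights=(1.0, 1.0, 1.0)):
    """Assumed reward: importance minus redundancy plus accuracy, weighted."""
    w_imp, w_red, w_acc = weights
    k = len(selected)
    mean_importance = float(np.mean(importance[selected]))
    redundancy = 0.0
    if k > 1:
        corr = abs_correlation(X, selected)
        # Mean absolute correlation over off-diagonal pairs.
        redundancy = float((corr.sum() - np.trace(corr)) / (k * (k - 1)))
    return w_imp * mean_importance - w_red * redundancy + w_acc * accuracy


# Toy usage on synthetic data (not one of the paper's datasets).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=200)  # feature 1 nearly duplicates feature 0
weight = rng.normal(size=(4, 8))                 # untrained, random GCN weights
state = gcn_state(X, [0, 1, 3], weight)          # 8-dimensional input for a Q-network
r = reward(np.ones(5), X, [0, 1, 3], accuracy=0.9)
```

In a deep Q-network setting, a vector such as `state` would be fed to the Q-network, whose outputs score the candidate features to add next; under this assumed reward, adding a feature that is highly correlated with one already selected lowers the return.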

History
  • Received: 2025-02-10
  • Last revised: 2025-03-07
  • Accepted: 2025-03-07