Abstract
Feature selection plays a vital role as a preprocessing step for high-dimensional data in machine learning. Its basic purpose is to avoid the "curse of dimensionality" and to reduce the time and space complexity of training. Several techniques, including information-theoretic ones, have been proposed in the literature to measure the information content of a feature. Most of them incrementally select features with maximum dependency on the class but minimum redundancy with already selected features. A key idea missing from these techniques is fair representation of high-dependency features across the different classes: features with high mutual information (MI) with one particular class are selected disproportionately. This can bias classification in favor of that class, while the other classes obtain low matching scores. We propose a novel information-theoretic approach that selects features in a class-wise fashion rather than by their global maximum dependency. In addition, a constrained search is used instead of a global sequential forward search. We prove that our proposed approach enhances Maximum Relevance while preserving Minimum Redundancy under a constrained search. Results on multiple benchmark datasets show that our proposed method improves accuracy over other state-of-the-art feature selection algorithms while having lower time complexity.
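To make the idea concrete, the following is a minimal sketch of class-wise, MI-based greedy selection with a relevance-minus-redundancy criterion. It is an illustrative reconstruction, not the paper's actual algorithm: the one-vs-rest relevance target, the round-robin visiting of classes, and the function names (`mutual_information`, `classwise_mrmr`) are all assumptions made for this example.

```python
import numpy as np

def mutual_information(x, y):
    """Empirical mutual information between two discrete vectors (nats)."""
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                px, py = np.mean(x == xv), np.mean(y == yv)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def classwise_mrmr(X, y, k):
    """Greedy mRMR-style selection that alternates over classes so each
    class contributes its most relevant features.  Class-wise relevance is
    scored against a one-vs-rest label (hypothetical sketch, not the
    published method)."""
    classes = np.unique(y)
    selected, remaining = [], set(range(X.shape[1]))
    c_idx = 0
    while len(selected) < k and remaining:
        # Round-robin over classes: each pick is made for one class's benefit.
        yc = (y == classes[c_idx % len(classes)]).astype(int)
        best, best_score = None, -np.inf
        for f in remaining:
            rel = mutual_information(X[:, f], yc)      # class-wise relevance
            red = (np.mean([mutual_information(X[:, f], X[:, s])
                            for s in selected])
                   if selected else 0.0)               # redundancy w/ selected
            if rel - red > best_score:
                best, best_score = f, rel - red
        selected.append(best)
        remaining.discard(best)
        c_idx += 1
    return selected
```

A feature that perfectly separates one class is picked before a noisy one, since its class-wise MI dominates while its redundancy with the (initially empty) selected set is zero.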
Original language | English
---|---
Pages (from-to) | 3211–3224
Number of pages | 14
Journal | International Journal of Machine Learning and Cybernetics
Volume | 13
Issue number | 10
Early online date | 20 Jun 2022
Publication status | Published - Oct 2022
Keywords
- Class-wise feature selection
- Classification
- Feature selection
- Information theory