A new feature selection method based on frequent and associated itemsets for text classification
Published in: | Concurrency and computation 2022-11, Vol.34 (25), p.n/a |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Feature selection is one of the major issues in pattern recognition. The quality of the selected features is important for classification, as low‐quality data can degrade model construction. Because selected features often contain redundant information, this article draws on association analysis theory from data mining to select important features. In this study, a novel feature selection method based on frequent and associated itemsets (FS‐FAI) for text classification is proposed. FS‐FAI seeks to find relevant features while also taking feature interaction into account. Moreover, it uses association as a metric to evaluate the relevance of feature(s) to the target concept. To evaluate the efficacy of the proposed method, several experiments were conducted on a BBC dataset from the BBC news website and the SMS Spam Collection dataset from the UCI Machine Learning Repository. The results were compared with those of well‐known feature selection methods and demonstrated the effectiveness of the proposed method in selecting high‐quality features and in handling redundant information in text classification. |
---|---|
ISSN: | 1532-0626, 1532-0634 |
DOI: | 10.1002/cpe.7258 |