Correlation Based Feature Selection with Clustering for Multi-Class Classification Tasks
Abstract
High-dimensional data is growing rapidly, and feature selection has become a popular way to reduce dimensionality. Many researchers prefer correlation-based feature selection methods for grouping the attributes of a dataset. The main purpose of feature selection is to select the features most relevant to the problem and to remove unnecessary (noisy and redundant) ones. Many types of correlation-based feature selection have been proposed. In previous work, mutual information, the correlation coefficient, and the chi-square statistic have been used to measure the dependency between two features. In this paper we combine two methods to find the correlation among features. First, we compute a covariance matrix from the instances of the dataset using the covariance formula; the resulting matrix has the same dimensionality as the main dataset. Next, we apply Affinity Propagation (AP) clustering to group the features, take random features from each cluster, and append them to form a feature subset. Finally, we build a model on that subset using general machine learning algorithms, measure its accuracy, and compare it with the accuracy obtained on the main dataset.
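The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: it assumes NumPy and scikit-learn, uses the Wine dataset as a stand-in multi-class problem, treats each row of the covariance matrix as the profile of one feature when clustering, draws a single random feature per cluster, and uses a k-nearest-neighbours classifier as the "general" learner. The helper name `select_features` is hypothetical.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def select_features(X, seed=0):
    """Group features with Affinity Propagation and keep one random feature per group."""
    rng = np.random.default_rng(seed)
    # Feature-by-feature covariance matrix, shape (n_features, n_features).
    # X is standardized by the caller, so this is proportional to the
    # correlation matrix.
    cov = np.cov(X, rowvar=False)
    # Assumption: each row of the covariance matrix is clustered as one
    # "sample" with scikit-learn's default similarity (negative squared
    # Euclidean distance). The thesis may define the similarity differently.
    labels = AffinityPropagation(random_state=seed).fit(cov).labels_
    # Draw one feature index at random from every cluster.
    selected = [
        int(rng.choice(np.flatnonzero(labels == c))) for c in np.unique(labels)
    ]
    return sorted(selected), labels


# Stand-in multi-class dataset (3 classes, 13 features); not from the thesis.
X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)

selected, labels = select_features(X)

# Compare cross-validated accuracy on all features and on the selected subset.
clf = KNeighborsClassifier(n_neighbors=5)
acc_full = cross_val_score(clf, X, y, cv=5).mean()
acc_subset = cross_val_score(clf, X[:, selected], y, cv=5).mean()

print("selected feature indices:", selected)
print(f"accuracy, all features:    {acc_full:.3f}")
print(f"accuracy, selected subset: {acc_subset:.3f}")
```

Because the features are picked at random within each cluster, the subset and its accuracy depend on `seed`; averaging over several seeds would give a steadier comparison against the full feature set.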