A Hybrid Approach for Feature Subset Selection to Classify High-Dimensional Data
Date
2019-03-27
Author
Akter, Sanzida
Pritom, Hasib Rashid
Reza, Antara Anika
Suvo, Rakib Hasan
Abstract
High-dimensional data sets are now routine. In social media, business, medicine, and gene expression analysis, thousands of millions of data points must be processed every day, so handling big or high-dimensional data well is important. Feature selection is one of the best ways to reduce the dimensionality of such data to a point where it can be handled. In this paper we propose three approaches for handling high-dimensional data: a hybrid approach for high-dimensional data sets, and two new ways of applying ReliefF and Symmetrical Uncertainty Attribute Evaluation. The hybrid approach combines the filter and wrapper methods; it was tested experimentally and proved efficient and effective at feature selection. The wrapper method is well suited to producing the best result, but on high-dimensional data sets it can take days. Our approach took only a few minutes, because most of the irrelevant or redundant features had already been trimmed off. It is therefore less expensive than the wrapper method and more effective than the filter method. Our ways of applying ReliefF and Symmetrical Uncertainty also give good results, provided the right threshold is chosen and the proposed procedure is followed.
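The filter-then-wrapper idea described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. It assumes discrete feature values, uses Symmetrical Uncertainty with a fixed threshold as the filter, and uses greedy forward selection scored by leave-one-out 1-nearest-neighbour accuracy as the wrapper. All function names (`filter_stage`, `wrapper_stage`, `hybrid_select`), the choice of classifier, and the threshold value are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def filter_stage(rows, labels, threshold):
    """Cheap filter: keep feature indices whose SU with the class >= threshold."""
    n_features = len(rows[0])
    return [j for j in range(n_features)
            if symmetrical_uncertainty([r[j] for r in rows], labels) >= threshold]


def loo_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of 1-NN (Hamming distance) restricted to `subset`."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for k, other in enumerate(rows):
            if k == i:
                continue  # never compare a sample with itself
            dist = sum(row[j] != other[j] for j in subset)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[k]
        correct += best_label == labels[i]
    return correct / len(rows)


def wrapper_stage(rows, labels, candidates):
    """Expensive wrapper: greedy forward selection over the filtered candidates."""
    selected, best_score = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(loo_accuracy(rows, labels, selected + [j]), j) for j in remaining]
        score, feature = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break  # no candidate improves accuracy any further
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


def hybrid_select(rows, labels, threshold=0.1):
    """Run the filter first so the wrapper only searches the surviving features."""
    return wrapper_stage(rows, labels, filter_stage(rows, labels, threshold))


if __name__ == "__main__":
    # Toy data (hypothetical): feature 0 tracks the class, feature 1 is
    # unrelated to it, and feature 2 is constant.
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    rows = list(zip(labels, [0, 1, 0, 1, 0, 1, 0, 1], [1] * 8))
    print(hybrid_select(rows, labels, threshold=0.1))
```

The point of the design is that the wrapper's cost grows with the number of candidate features, so trimming irrelevant features with an inexpensive filter first keeps the wrapper search small. The threshold controls that trade-off and would need tuning for real data.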
Collections
- B.Sc Thesis/Project [82]