Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks
Date
2014-03-31
Author
Rahman, Chowdhury Mofizur
Farid, Dewan Md
Hossain, M Alamgir
Strachan, Rebecca
Abstract
In this paper, we introduce two independent hybrid mining algorithms to improve the classification
accuracy rates of decision tree (DT) and naïve Bayes (NB) classifiers for the classification of multi-class
problems. Both DT and NB classifiers are useful, efficient and commonly used for solving classification
problems in data mining. Noisy, contradictory instances in the training set may cause the
generated decision tree to overfit and lose accuracy, so in our first
proposed hybrid DT algorithm we employ a naïve Bayes (NB) classifier to remove the noisy, troublesome
instances from the training set before DT induction. Moreover, it is extremely computationally expensive
for an NB classifier to compute class conditional independence for a dataset with high-dimensional
attributes. Thus, in the second proposed hybrid NB classifier, we employ DT induction to select a
comparatively more important subset of attributes, to which the naïve assumption of class conditional
independence is then applied. We tested the performances of the two proposed hybrid algorithms against those
of the existing DT and NB classifiers respectively, using classification accuracy, precision, sensitivity-specificity
analysis, and 10-fold cross validation on 10 real benchmark datasets from the UCI (University of
California, Irvine) Machine Learning Repository. The experimental results indicate that the proposed
methods have produced impressive results in the classification of real-life, challenging multi-class problems.
They are also able to automatically extract the most valuable training data and identify the
most effective attributes for describing instances from noisy, complex training databases with
large numbers of attributes.
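
The two hybrids described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes categorical attributes, substitutes a plain ID3-style tree (information gain, no pruning) for whatever DT learner the paper uses, treats an instance as "noisy" when an NB model trained on the full set misclassifies it, and keeps for the hybrid NB exactly those attributes that appear in the induced tree. All function names (`nb_fit`, `id3`, `hybrid_dt`, `hybrid_nb`) and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def nb_fit(rows, labels):
    """Categorical naive Bayes with Laplace smoothing; returns a predict function."""
    n, class_counts = len(rows), Counter(labels)
    value_counts, domains = Counter(), defaultdict(set)
    for row, cls in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, v, cls)] += 1
            domains[i].add(v)

    def log_score(row, cls):
        # log prior plus the sum of smoothed log likelihoods (unnormalized posterior)
        score = math.log(class_counts[cls] / n)
        for i, v in enumerate(row):
            score += math.log((value_counts[(i, v, cls)] + 1)
                              / (class_counts[cls] + len(domains[i])))
        return score

    return lambda row: max(class_counts, key=lambda cls: log_score(row, cls))

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def id3(rows, labels, attrs):
    """Leaf = class label; internal node = (attribute index, {value: subtree}, majority)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority

    def gain(a):
        parts = defaultdict(list)
        for row, cls in zip(rows, labels):
            parts[row[a]].append(cls)
        return entropy(labels) - sum(len(p) / len(labels) * entropy(p)
                                     for p in parts.values())

    best = max(attrs, key=gain)
    if gain(best) <= 1e-12:          # no attribute reduces entropy: stop here
        return majority
    rest = [a for a in attrs if a != best]
    children = {}
    for value in {row[best] for row in rows}:
        sub = [(r, c) for r, c in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*sub)
        children[value] = id3(sub_rows, sub_labels, rest)
    return (best, children, majority)

def tree_predict(tree, row):
    while isinstance(tree, tuple):
        attr, children, majority = tree
        tree = children.get(row[attr], majority)   # unseen value: fall back to majority
    return tree

def tree_attributes(tree):
    """Set of attribute indices tested anywhere in the tree."""
    if not isinstance(tree, tuple):
        return set()
    attr, children, _ = tree
    return {attr}.union(*(tree_attributes(c) for c in children.values()))

def hybrid_dt(rows, labels):
    """Assumed filter: drop instances the NB model misclassifies, then induce the tree."""
    nb = nb_fit(rows, labels)
    kept = [(r, c) for r, c in zip(rows, labels) if nb(r) == c]
    if not kept:                      # degenerate case: keep everything
        kept = list(zip(rows, labels))
    kept_rows, kept_labels = zip(*kept)
    return id3(kept_rows, kept_labels, list(range(len(rows[0]))))

def hybrid_nb(rows, labels):
    """Assumed selection: train NB only on the attributes the tree actually tests."""
    tree = id3(rows, labels, list(range(len(rows[0]))))
    selected = sorted(tree_attributes(tree))
    project = lambda row: tuple(row[i] for i in selected)
    nb = nb_fit([project(r) for r in rows], labels)
    return selected, (lambda row: nb(project(row)))

# Toy data: attribute 0 decides the class, attribute 1 is irrelevant,
# and the last instance contradicts the earlier ("sunny", "hot") -> "yes" examples.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")] * 2
rows += [("sunny", "hot")]
labels = ["yes", "yes", "no", "no"] * 2 + ["no"]

print(sorted(tree_attributes(id3(rows, labels, [0, 1]))))   # [0, 1]: noise forces an extra split
print(sorted(tree_attributes(hybrid_dt(rows, labels))))     # [0]: filtered tree is smaller
selected, predict = hybrid_nb(rows[:8], labels[:8])
print(selected, predict(("rainy", "hot")))                  # [0] no
```

On this toy data the single contradictory instance makes the plain tree split on the irrelevant attribute, while the NB-filtered tree tests only the informative one; the hybrid NB likewise discards the attribute the tree never uses.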