Scalable Decision Tree Induction For Mining Big Data

UIU Institutional Repository

    • School of Science and Engineering (SoSE)
    • Department of Computer Science and Engineering (CSE)
    • B.Sc Thesis/Project

    View/Open
    Scalable Decision Tree Induction For Mining Big Data (2).pdf (3.772Mb)
    Date
    2019-05-22
    Author
    Sabah, Shabnam
    Anwar, Sara Zumerrah Binte
    Afroze, Sadia
    Sarker, Snigdha
    Abstract
    Big data mining is one of the major research challenges in machine learning and data mining in the present digital era. Big data is characterized by the 3 V's: (1) volume - a massive amount of data (too many bytes), (2) velocity - high-speed streaming data (too high a rate), and (3) variety - data arriving from many different sources (too many sources). Collecting and managing real-life big data is difficult because the data are too large to keep together on a single machine. Therefore, we need advanced relational database management systems with parallel computing to deal with big data. Mining knowledge from big data with traditional machine learning and data mining techniques is a major problem that attracts computational intelligence researchers. In this paper, we use the decision tree (DT) induction method for mining big data. Decision tree induction is one of the most popular and well-known supervised learning techniques; it is a top-down, recursive, divide-and-conquer algorithm that requires little prior knowledge to construct a classifier. Traditional DT algorithms such as Iterative Dichotomiser 3 (ID3), C4.5 (a successor of ID3), and Classification and Regression Trees (CART) are generally built for mining relatively small datasets, so a more scalable decision tree learning approach is needed for mining big data. We generated several trees with two scalable decision tree algorithms, RainForest and the Bootstrapped Optimistic Algorithm for Tree construction (BOAT), using seven benchmark datasets from the KEEL repository and the UCI Machine Learning Repository, and compared the performance of the two algorithms. We also propose a decision tree merging approach, as decision tree merging is a complex and challenging task.
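The top-down, recursive, divide-and-conquer induction mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis code and it is neither RainForest nor BOAT: it assumes an ID3-style information-gain criterion on categorical attributes, holds all data in memory, and uses hypothetical names (`entropy`, `information_gain`, `build_tree`, `classify`) and a made-up toy dataset.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by partitioning the rows on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, attrs):
    """Top-down recursive induction: pick a split, partition the data, recurse."""
    if len(set(labels)) == 1:      # pure node: emit a leaf
        return labels[0]
    if not attrs:                  # no attributes left: emit a majority-class leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    partitions = {}
    for row, label in zip(rows, labels):
        part_rows, part_labels = partitions.setdefault(row[best], ([], []))
        part_rows.append(row)
        part_labels.append(label)
    return {best: {value: build_tree(r, l, remaining)
                   for value, (r, l) in partitions.items()}}

def classify(tree, row, default=None):
    """Follow branches until a leaf; return `default` for an unseen attribute value."""
    while isinstance(tree, dict):
        attr, branches = next(iter(tree.items()))
        if row.get(attr) not in branches:
            return default
        tree = branches[row[attr]]
    return tree

# Made-up toy data, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "sunny", "windy": "yes"},
]
labels = ["play", "play", "play", "stay", "play"]

tree = build_tree(rows, labels, ["outlook", "windy"])
print(tree)
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))
```

Each recursive call in this sketch re-scans and copies its partition in memory, which is practical only for small datasets; that cost is the limitation of traditional algorithms that motivates the scalable approaches named in the abstract.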
    URI
    http://dspace.uiu.ac.bd/handle/52243/1117
    Collections
    • B.Sc Thesis/Project [82]

    Copyright 2003-2017 United International University
    Contact Us | Send Feedback
    Developed by UIU CITS
     

     
