Pattern Mining From Unlabeled News Article Dataset Using Semi-Supervised Learning

UIU Institutional Repository


    View/Open
    PatternMiningFromUnlabeledNewsArticleDatasetUsingSemiSupervisedLearning.pdf (6.041Mb)
    Date
    2023-03-11
    Author
    Khandokar, Iftakhar Ali
    Abstract
    Text classification is one of the prominent tasks in Natural Language Processing, as the amount of textual data grows rapidly day by day. Given the limits of memory and computational power, there is an urgent demand to build knowledge models that extract the underlying information from this growing data. One example of such rapidly growing data is the news articles produced daily by the vast number of news publication platforms. In this work, we introduce an automated approach to detect target events related to violent incidents in textual news articles, where multiple event labels are detected within the news articles and several types of information are extracted from them. To ensure the authenticity of the classification and event pattern analysis, we adopted a semi-supervised approach in which a predictive model trained on a small volume of data is used to amplify the dataset, eventually gathering enough data to feed into the deep network. The event types detected are Murder, Rape, Kidnap, Clash, Suicide, and Teen Suicide. We experimented with multiple feature extraction techniques, including N-gram, TF-IDF, Word2Vec, fastText, and BERT, of which the BERT-based classifier achieved the highest accuracy of 92% on the provided test set. Using the best-performing model, we conducted a trend and pattern analysis experiment on five years of time series data, which reveals some interesting insights related to or affected by these violent events across various units of time.
    URI
    http://dspace.uiu.ac.bd/handle/52243/2732
    Collections
    • M.Sc Thesis/Project [151]

    Copyright 2003-2017 United International University
    Contact Us | Send Feedback
    Developed by UIU CITS
     

     
