BANGLA ASPECT-BASED SENTIMENT ANALYSIS BY SUPERVISED LEARNING BASED ON ASPECT TERM EXTRACTION

An Naim, Forhad

Date

2020-10-20

Abstract

Sentiment analysis is the process of extracting human sentiment from text. Aspect-based sentiment analysis (ABSA) goes one step further by automatically assigning sentiments to specific aspect terms. It is a text analysis technique that extracts and separates each aspect term and identifies the sentiment polarity associated with it. Bangla is the seventh most widely spoken native language in the world, with almost 230 million speakers, so sentiment analysis in Bangla is an important and timely research topic. Aspect-based sentiment analysis has recently gained attention because it identifies fine-grained opinion polarity associated with specific aspect terms. However, because of the lack of resources such as annotated datasets and corpora, aspect-based sentiment analysis for Bangla remains a difficult task. In this thesis, we used two publicly available datasets, named cricket and restaurant. To perform aspect category extraction, one of the ABSA subtasks, we based our experiments on two recent studies from 2018 and 2020. Those studies used conventional supervised learning algorithms: SVM, RF, and KNN in the 2018 study, and SVM, RF, KNN, NB, and LR in the 2020 study. In our work, after pre-processing the datasets, we applied a new technique named PSPWA (Priority Sentence Part Weight Assignment). We then used several conventional supervised learning algorithms (SVM, KNN, RF, LR, and NB) to produce results. Because the datasets are imbalanced, we used the F1-score as the performance measure. We then compared our results with the previous research on the same datasets. On the cricket dataset, SVM, KNN, LR, and NB outperformed both existing works, with F1 scores of 46%, 31%, 43%, and 32%, respectively. On the restaurant dataset, SVM, LR, and NB outperformed both existing works, with F1 scores of 49%, 44%, and 34%, respectively.
For both the cricket and restaurant datasets, SVM achieved the best F1 score among all algorithms, scoring 46% and 49%, respectively.
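
The choice of F1-score over accuracy for imbalanced data can be illustrated with a small sketch. This is not the thesis code: the category labels (`batting`, `bowling`) and the toy predictions are hypothetical, and macro averaging is an assumption, since the abstract does not state which F1 averaging was used.

```python
# Illustrative sketch only: hypothetical labels and predictions,
# macro-averaged F1 assumed (the abstract does not specify the averaging).

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Imbalanced toy data: 8 samples of one category, 2 of another.
y_true = ["batting"] * 8 + ["bowling"] * 2
# A degenerate classifier that always predicts the majority category.
y_pred = ["batting"] * 10

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

In this toy case the majority-class predictor looks strong on accuracy while its macro F1 is much lower, because the minority category contributes an F1 of zero. This is the usual motivation for reporting F1 on imbalanced datasets.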

URI

http://dspace.uiu.ac.bd/handle/52243/1897

Collections

M.Sc Thesis/Project