Towards Comprehensive Bangla Computing: Corpus and Lexicon with Spell and Grammar Checker

Hossain, Nahid

dc.contributor.author	Hossain, Nahid
dc.date.accessioned	2019-02-06T08:42:33Z
dc.date.available	2019-02-06T08:42:33Z
dc.date.issued	2019-02-06
dc.identifier.uri	http://dspace.uiu.ac.bd/handle/52243/760
dc.description.abstract	In this thesis, we present a comprehensive Bangla spell and grammar checker and the techniques used to build it. To make the grammar checker highly accurate and robust, we have built the largest Bangla monolingual corpus, comprising over 100 million words. Moreover, we have built the largest Bangla lexicon, with over 1 million unique words extracted from the monolingual corpus, to enrich the spell checker. Because the system embeds a very large amount of data, we use hashing, pre-defined double metaphone encodings, and pre-defined counts for language model probabilities to increase efficiency and reduce processing time. In addition, our spell and grammar checker improves over time by keeping a local, per-user log of previous suggestions and choices and using it for future suggestions, which gives each user a customized experience. Bangla is spoken by over 300 million people around the world. Writing a polished Bangla article for a variety of publications requires a robust spell and grammar checker. However, little research has been done in this area of language processing, and therefore no existing Bangla spell and grammar checker provides satisfactory output. Some studies have addressed spell checking or grammar checking individually. However, checking spelling and grammar at the same time is essential when writing a new article. Moreover, no previous research has used as large an amount of data as we have. In addition, all of these studies were carried out for research purposes only, without considering practical applications, which is why they show several shortcomings in real-life text processing. In this thesis, we demonstrate the techniques used to build our corpus, lexicon, and spell and grammar checker, and we describe the limitations of other studies and our solutions to those limitations.	en_US
dc.language.iso	en_US	en_US
dc.publisher	United International University	en_US
dc.subject	Bangla spell and grammar checker	en_US
dc.subject	Bangla monolingual corpus	en_US
dc.subject	Bangla lexicon	en_US
dc.subject	double metaphone	en_US
dc.title	Towards Comprehensive Bangla Computing: Corpus and Lexicon with Spell and Grammar Checker	en_US
dc.type	Thesis	en_US

Files in this item

Name: Nahid Thesis Report.pdf
Size: 16.45 MB
Format: PDF

This item appears in the following Collection(s)

M.Sc Thesis/Project [167]
