A Novel Approach to Predict the Origin of Replication
Abstract
The genome of every species contains an origin of replication (ORI), the site at which the genome begins to replicate during cell division. Because the ORI is the key factor in initiating DNA replication, locating it is an important and challenging problem in bioinformatics research. In this study, we select a benchmark dataset for the yeast Saccharomyces cerevisiae, generate simple and inexpensive sequence-based features, label and prepare those features for computation, feed them to 10 basic machine learning algorithms, and compare the results. We then propose a novel approach that predicts the origin of replication with 98.15% accuracy, using a Logistic Regression classifier with 10-fold cross-validation. We also present a comparison table with the results of all 10 classifiers tested, to show how clearly the proposed approach outperforms the others.
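The pipeline described in the abstract (sequence-based features, a Logistic Regression classifier, 10-fold cross-validation) can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the thesis: the abstract does not say which features were used, so normalized dinucleotide (k-mer) frequencies stand in for them; the classifier is a plain gradient-descent logistic regression written for illustration; the helper names (`kmer_features`, `train_logreg`, `cross_val_accuracy`, `toy_sequence`) are hypothetical; and the data are synthetic toy sequences, not the S. cerevisiae benchmark, so the printed accuracy says nothing about the reported 98.15%.

```python
import math
import random
from itertools import product


def kmer_features(seq, k=2):
    """Normalized k-mer frequencies of a DNA sequence (assumed feature set)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        sub = seq[i:i + k]
        if sub in counts:  # skip windows containing non-ACGT symbols
            counts[sub] += 1
    total = max(sum(counts.values()), 1)  # avoid division by zero
    return [counts[m] / total for m in kmers]


def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, x):
    """Return 1 when the predicted probability exceeds 0.5, else 0."""
    w, b = model
    return 1 if b + sum(wj * xj for wj, xj in zip(w, x)) > 0 else 0


def cross_val_accuracy(X, y, folds=10, seed=0):
    """Shuffle once, hold out each fold in turn, and pool the test accuracy."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = train_logreg([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)


def toy_sequence(rng, at_rich, length=200):
    """Synthetic sequence; the base composition is arbitrary toy data."""
    weights = [0.4, 0.1, 0.1, 0.4] if at_rich else [0.1, 0.4, 0.4, 0.1]
    return "".join(rng.choices("ACGT", weights=weights, k=length))


rng = random.Random(42)
labels = [1 if i % 2 == 0 else 0 for i in range(40)]  # 1 = toy "ORI" class
seqs = [toy_sequence(rng, at_rich=(lab == 1)) for lab in labels]
X = [kmer_features(s) for s in seqs]
accuracy = cross_val_accuracy(X, labels)
print(f"10-fold CV accuracy on toy data: {accuracy:.2f}")
```

In practice a library implementation would likely replace the hand-written pieces, for example scikit-learn's `LogisticRegression` together with `cross_val_score`; the pure-Python version is used here only to keep the sketch self-contained.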