Application of Machine Learning Algorithms to Identify Recombination Spots
Meiotic recombination is a mechanism by which a cell promotes the correct segregation of homologous chromosomes and the repair of DNA damage. It does not occur uniformly across the genome: regions with relatively high recombination frequencies are called hotspots, and regions with relatively low frequencies are called cold spots. Accurate prediction of hot and cold spots remains an open challenge. Computational methods for recombination hotspot prediction often rely on sophisticated features derived from physicochemical or structure-based properties of nucleotides. In this work, we show the use of sequence-based features, which are computationally cheaper to generate. Our data set is a DNA data set of sequence strings, which we rearranged to simplify processing and then represented using gapped k-mer composition. We applied several algorithms to this data set to predict hot and cold spots, tested our approach on a standard benchmark data set, and evaluated it with both 5-fold and 10-fold cross-validation. Our analysis shows that, compared to other methods, our approach produces significantly better results in terms of accuracy. For 5-fold cross-validation, SVM gives the best sensitivity among all the algorithms, 0.7707. For 10-fold cross-validation, LR and ANN both give the best sensitivity, 0.7622, with SVM close behind at 0.7601.
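The abstract does not define the gapped k-mer composition it uses, so the following is only a minimal sketch of one common reading of the term: the normalized frequency of nucleotide pairs separated by a fixed number of gap positions. The function names `gapped_pair_composition` and `featurize`, the restriction to pairs (k = 2), and the `max_gap` parameter are all assumptions made for illustration, not the thesis's actual feature extractor.

```python
from itertools import product

ALPHABET = "ACGT"


def gapped_pair_composition(seq, gap):
    """Normalized frequency of nucleotide pairs (a, b) with `gap` positions between them.

    Hypothetical helper: one possible form of a gapped k-mer feature (k = 2).
    Returns 16 values, ordered alphabetically by pair (AA, AC, ..., TT).
    """
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(ALPHABET, repeat=2)}
    span = gap + 1  # distance between the two paired positions
    total = 0
    for i in range(len(seq) - span):
        key = seq[i] + seq[i + span]
        if key in counts:  # skip pairs containing ambiguous bases such as N
            counts[key] += 1
            total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [counts[key] / total for key in sorted(counts)]


def featurize(seq, max_gap=2):
    """Concatenate pair compositions for gaps 0..max_gap into one feature vector."""
    features = []
    for gap in range(max_gap + 1):
        features.extend(gapped_pair_composition(seq, gap))
    return features
```

Under these assumptions, each DNA string becomes a fixed-length numeric vector (16 values per gap size) that a classifier such as SVM, LR, or ANN can take as input, and building it needs only a single pass over the sequence per gap size.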
- B.Sc Thesis/Project