Prediction of Protein methylation sites of lysine residues using machine learning algorithms
Abstract
Post-translational modifications (PTMs) play an essential role in biological and molecular mechanisms and are considered vital elements of cell signaling and networking pathways. Among the different PTMs, methylation is regarded as one of the most important. It plays a crucial role in maintaining the dynamic balance, stability, and remodeling of chromatin. Methylation can also lead to various abnormalities in cells and is responsible for many serious diseases. Methylation can be detected experimentally, for example with methylation-specific antibodies, mass spectrometry, or radioactive labeling to characterize methylation sites. However, these experimental approaches are time-consuming and costly, so there is a demand for fast and accurate computational techniques that address these issues. This study proposes a machine-learning-based approach called LyMethySE to predict methylation sites of lysine residues in proteins. To build the model, we extract features by combining an evolutionary-based bigram profile with predicted structural information. To the best of our knowledge, no previous method has used this combination of information as a feature extraction technique for predicting methylation sites of lysine residues. Incorporating the profile bigram also lets LyMethySE keep the feature size limited across different window sizes of evolutionary information. To evaluate our feature extraction technique, we apply eight classifiers that are widely used in the literature as predictors. Among them, the Support Vector Machine (SVM) achieves the best results, so we use SVM as the base classifier for LyMethySE. This study also presents a comparative analysis of the impact of different base classifiers on our extracted features.
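The abstract's point about the profile bigram and window size can be illustrated with a small sketch. It assumes the commonly used bigram formulation, in which entry (m, n) sums the product of column m at position i and column n at position i + 1 over a window of profile rows. The function name `profile_bigram` and the toy three-column profiles are hypothetical and chosen for illustration only; this is not the LyMethySE implementation, and a real position-specific scoring matrix would have 20 columns.

```python
def profile_bigram(profile):
    """Return the flattened bigram matrix of an L x A profile.

    profile: list of L rows, each a list of A scores (A = 20 for a real PSSM).
    Entry (m, n) sums profile[i][m] * profile[i + 1][n] over consecutive rows,
    so the output always has A * A values regardless of L.
    """
    if not profile:
        raise ValueError("profile must contain at least one row")
    width = len(profile[0])
    if any(len(row) != width for row in profile):
        raise ValueError("all rows must have the same length")
    bigram = [[0.0] * width for _ in range(width)]
    # Pair each row with the row that follows it in the window.
    for current, following in zip(profile, profile[1:]):
        for m, a in enumerate(current):
            for n, b in enumerate(following):
                bigram[m][n] += a * b
    return [value for row in bigram for value in row]


# Hypothetical toy profiles with 3 columns, for two different window sizes.
short_window = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
long_window = short_window + [[0.3, 0.3, 0.4], [0.6, 0.1, 0.3]]

features_short = profile_bigram(short_window)
features_long = profile_bigram(long_window)
print(len(features_short), len(features_long))  # prints: 9 9
```

Both windows yield the same number of features (the square of the column count), which is why a classifier trained on such vectors does not need a different input size when the window size changes.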
Collections
- M.Sc Thesis/Project