Novel Feature Extraction for Predicting Gram-Positive and Gram-Negative Bacteria Protein Sub-cellular Localization
Abstract
Protein sub-cellular localization prediction is the task of predicting the location in the cell where a given protein functions. It is considered an important step towards protein function prediction and drug design. In this study, we used structural and evolutionary features to represent the sequences in gram-positive and gram-negative bacterial protein datasets. Recent work in this area has relied mainly on PSSM features and has not used SPIDER features. PSSM mainly provides evolutionary features, whereas SPIDER mainly provides structural features. In our study, we extracted features from both PSSM and SPIDER: 46 features in 10 categories for both datasets, from which we chose the 6 best features for both datasets. Our results show that SPIDER features combined with PSSM features perform better than previously reported results on the gram-positive benchmark, but slightly worse on the gram-negative benchmark. We tried several kinds of classifiers, among which the Support Vector Machine (SVM) gave the best results.
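To illustrate how evolutionary and structural information can be combined into one fixed-length vector, the following is a minimal sketch and not the thesis's actual feature set. It assumes a per-residue PSSM (rows are residues, columns are substitution scores) and per-residue secondary-structure probabilities of the kind a structural predictor such as SPIDER provides. The function names, the logistic squashing step, and the toy data are all hypothetical choices made for illustration.

```python
import math


def column_means(matrix):
    """Average each column of a per-residue matrix (rows = residues)."""
    if not matrix:
        raise ValueError("matrix must contain at least one row")
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]


def pssm_composition(pssm):
    """Squash raw PSSM scores into (0, 1) with a logistic function,
    then average each column over the whole sequence.
    (Assumed normalization; the thesis does not state which one it uses.)"""
    squashed = [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in pssm]
    return column_means(squashed)


def combined_features(pssm, structure_probs):
    """Concatenate an evolutionary descriptor (from the PSSM) with a
    structural descriptor (mean secondary-structure probabilities)."""
    return pssm_composition(pssm) + column_means(structure_probs)


# Toy 3-residue protein. A real PSSM has 20 columns; 4 are used here for brevity.
toy_pssm = [
    [0, 2, -1, 0],
    [1, 0, 0, -3],
    [-2, 0, 1, 3],
]
# Hypothetical per-residue probabilities for three secondary-structure states.
toy_structure = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
]

features = combined_features(toy_pssm, toy_structure)
print(len(features))  # 4 PSSM columns + 3 structural columns = 7
```

Because the resulting vector has the same length for every protein regardless of sequence length, vectors of this kind can be passed directly to a standard classifier such as an SVM.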
Collections
- M.Sc Thesis/Project