Convolutional Neural Networks with Image Representation of Amino Acid Sequences for Protein Function Prediction
Abstract
Proteins are among the most important molecules governing cellular processes in organisms, and understanding their functions is fundamental to understanding the basics of life. Several supervised learning approaches have been applied to predict protein function. In this thesis, we propose ProtConv, a convolutional neural network based approach that predicts protein function by converting amino acid sequences into two-dimensional images. We use a protein embedding technique based on transfer learning to generate a feature vector for each sequence. The feature vector is then converted into a square, single-channel image and fed into a convolutional network. The network architecture combines convolutional filters and average pooling layers, followed by dense fully connected layers, to predict a binary function label. We performed experiments on standard benchmark datasets from two important protein function prediction tasks: proinflammatory cytokines and anticancer peptides. Our experiments show that the proposed method, ProtConv, achieves state-of-the-art performance on both datasets.
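The pipeline described in the abstract (feature vector, square single-channel image, convolution, average pooling, dense layer, binary output) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the zero-padding scheme, the 100-dimensional embedding, the single 3x3 filter, the 2x2 pooling window, the random weights, and all function names are assumptions chosen purely for illustration.

```python
import math
import numpy as np


def vector_to_square_image(vec):
    """Zero-pad a 1-D feature vector and reshape it into a square 2-D image.

    The zero-padding scheme is an assumption; the abstract only states that
    the vector becomes a square, single-channel image.
    """
    vec = np.asarray(vec, dtype=np.float32).ravel()
    n = vec.size
    side = math.isqrt(n)
    if side * side < n:
        side += 1  # smallest square that can hold all n values
    padded = np.zeros(side * side, dtype=np.float32)
    padded[:n] = vec
    return padded.reshape(side, side)


def conv2d_valid(image, kernel):
    """Single-channel 2-D cross-correlation with no padding ('valid' mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def avg_pool(feature_map, size=2):
    """Non-overlapping average pooling; trailing rows/columns are dropped."""
    h = (feature_map.shape[0] // size) * size
    w = (feature_map.shape[1] // size) * size
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // size, size, w // size, size).mean(axis=(1, 3))


def predict_probability(image, kernel, weights, bias):
    """Conv -> ReLU -> average pool -> dense -> sigmoid (one binary output)."""
    features = np.maximum(conv2d_valid(image, kernel), 0.0)
    pooled = avg_pool(features).ravel()
    logit = float(pooled @ weights + bias)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical usage with a random stand-in for a protein embedding.
rng = np.random.default_rng(0)
embedding = rng.normal(size=100)            # assumed embedding length
image = vector_to_square_image(embedding)   # shape (10, 10)
kernel = rng.normal(size=(3, 3))            # untrained, illustrative filter
weights = rng.normal(size=16)               # (10-3+1)//2 = 4, so 4*4 = 16 inputs
probability = predict_probability(image, kernel, weights, bias=0.0)
```

In a real model the filter and dense weights would be learned from labelled sequences, and several filters and layers would typically be stacked; the sketch only shows how the shapes flow from embedding to a single probability.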
Collections
- M.Sc Thesis/Project [145]