A Computational Framework For Predicting Direct Contacts and Substructures Within Protein Complexes.
National Institutes of Health
Department of Physics and Computer Science - Dual Degree Engineering
Understanding the physical arrangement of subunits within protein complexes potentially provides valuable clues about how the subunits work together and how the complexes function. The majority of recent research focuses on identifying protein complexes as a whole and seldom studies the inner structures within complexes. In this study, we propose a computational framework to predict direct contacts and substructures within protein complexes. In this framework, we first train a supervised learning model of l2-regularized logistic regression to learn the patterns of direct and indirect interactions within complexes, from where physical subunit interaction networks are predicted. Then, to infer substructures within complexes, we apply a graph clustering method (i.e., maximum modularity clustering (MMC)) and a gene ontology (GO) semantic similarity based functional clustering on partially-and fully-connected networks, respectively. Computational results show that the proposed framework achieves fairly good performance of cross validation and independent test in terms of detecting direct contacts between subunits. Functional analyses further demonstrate the rationality of partitioning the subunits into substructures via the MMC algorithm and functional clustering.
Mei, S. and Zhang, Kun, "A Computational Framework For Predicting Direct Contacts and Substructures Within Protein Complexes." (2019). Faculty and Staff Publications. 80.