Title

A Computational Framework For Predicting Direct Contacts and Substructures Within Protein Complexes.

Funding Source

National Institutes of Health

Grant Number

5P20GM103424-17, 2U54MD007595

Department

Department of Physics and Computer Science - Dual Degree Engineering

Document Type

Article

Publication Date

10-25-2019

Abstract

Understanding the physical arrangement of subunits within protein complexes can provide valuable clues about how the subunits work together and how the complexes function. Most recent research focuses on identifying protein complexes as a whole and seldom examines their inner structure. In this study, we propose a computational framework to predict direct contacts and substructures within protein complexes. In this framework, we first train a supervised learning model, L2-regularized logistic regression, to learn the patterns of direct and indirect interactions within complexes, from which physical subunit interaction networks are predicted. Then, to infer substructures within complexes, we apply a graph clustering method, maximum modularity clustering (MMC), to partially connected networks and a gene ontology (GO) semantic similarity-based functional clustering to fully connected networks. Computational results show that the proposed framework achieves fairly good performance in both cross-validation and independent testing for detecting direct contacts between subunits. Functional analyses further support the partitioning of subunits into substructures by the MMC algorithm and functional clustering.
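The two computational steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-column feature vectors, the labels, the hyperparameters (`lam`, `lr`, `epochs`), and the six-subunit toy network are all hypothetical, chosen only to show the idea. The first part fits an L2-regularized logistic regression by plain gradient descent to separate direct from indirect subunit pairs. The second part computes the Newman modularity of a candidate partition, the quantity that maximum modularity clustering seeks to maximize; it scores a given partition and does not search for the best one.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_l2_logreg(X, y, lam=0.01, lr=0.5, epochs=2000):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The bias term b is left unpenalized. All hyperparameters are
    illustrative assumptions, not values from the paper.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    # Predicted probability that a subunit pair is in direct contact.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {node: c for c, nodes in enumerate(communities) for node in nodes}
    intra = [0] * len(communities)    # edges inside each community
    degree = [0] * len(communities)   # summed node degree per community
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2
               for c in range(len(communities)))

# Hypothetical pair features (two made-up scores per subunit pair);
# label 1 = direct contact, 0 = indirect interaction.
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_l2_logreg(X, y)
print("P(direct) for a new pair:", predict_proba(w, b, [0.85, 0.9]))

# Hypothetical predicted contact network: two triangles joined by one edge.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("D", "F")]
split = [{"A", "B", "C"}, {"D", "E", "F"}]
whole = [{"A", "B", "C", "D", "E", "F"}]
print("Q(split):", modularity(edges, split))
print("Q(whole):", modularity(edges, whole))
```

In this toy network the two-triangle partition scores higher than the single-community partition, which is the kind of comparison a modularity-maximizing method performs over many candidate partitions.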

Comments

DOI: 10.3390/biom9110656

PubMed ID: 31717703
