Human Acetylcholinesterase complexed with the snake-venom toxin fasciculin-II - Kryger et al. Sussman (2000)


As protein-protein interaction is intrinsic to most cellular processes, the ability to predict which proteins in the cell interact can aid significantly in identifying the function of newly discovered proteins, and in understanding the molecular networks they participate in. Here we demonstrate that characteristic pairs of sequence-signatures can be learned from a database of experimentally determined interacting proteins, where one protein contains one sequence-signature and its interacting partner contains the other. The sequence-signatures that recur in concert in various pairs of interacting proteins are termed correlated sequence-signatures, and it is proposed that they can be used for predicting putative pairs of interacting partners in the cell. We demonstrate the potential of this approach on a comprehensive database of experimentally determined pairs of interacting proteins in the yeast Saccharomyces cerevisiae. The proteins in this database have been characterized by their sequence-signatures, as defined by the InterPro classification. A statistical analysis performed on all possible combinations of sequence-signature pairs has identified those pairs that are over-represented in the database of yeast interacting proteins. It is demonstrated how the use of correlated sequence-signatures as identifiers of interacting proteins can significantly reduce the search space and enable directed experimental interaction screens.
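The over-representation analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified log-odds score, log2(observed / expected), where the observed value is the fraction of interacting pairs in which two signatures appear on opposite partners, and the expected value is the product of the two signatures' frequencies among the proteins in the dataset. The function name `signature_pair_log_odds`, the input layout, and the toy protein and signature identifiers are all hypothetical.

```python
import math
from collections import Counter


def signature_pair_log_odds(interactions, signatures):
    """Score signature pairs by a simplified log-odds of co-occurrence.

    interactions: list of (protein_a, protein_b) tuples (hypothetical layout).
    signatures:   dict mapping protein id -> set of signature ids.
    Returns a dict mapping an alphabetically sorted signature pair -> log2 score.
    """
    proteins = {p for pair in interactions for p in pair}
    n_proteins = len(proteins)
    n_pairs = len(interactions)

    # Background: how many proteins in the dataset carry each signature.
    sig_counts = Counter()
    for p in proteins:
        sig_counts.update(signatures.get(p, set()))

    # Observed: in how many interacting pairs the two signatures
    # appear on opposite partners (counted once per interaction).
    pair_counts = Counter()
    for a, b in interactions:
        combos = {
            tuple(sorted((x, y)))
            for x in signatures.get(a, set())
            for y in signatures.get(b, set())
        }
        pair_counts.update(combos)

    scores = {}
    for (x, y), count in pair_counts.items():
        observed = count / n_pairs
        # Assumption: independence of the two signatures as the null model.
        expected = (sig_counts[x] / n_proteins) * (sig_counts[y] / n_proteins)
        scores[(x, y)] = math.log2(observed / expected)
    return scores


# Toy data, invented for illustration only.
toy_interactions = [("P1", "P2"), ("P3", "P4"), ("P5", "P6")]
toy_signatures = {
    "P1": {"A"}, "P2": {"B"},
    "P3": {"A"}, "P4": {"B"},
    "P5": {"C"}, "P6": {"D"},
}
scores = signature_pair_log_odds(toy_interactions, toy_signatures)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

A positive score marks a signature pair seen across interacting partners more often than the independence assumption predicts. In a screen, candidate protein pairs carrying such a signature pair would be tested first, which is how the search space shrinks. With a real dataset, a significance test and a minimum-count threshold would be needed, since rare signatures inflate the ratio.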

Correlated sequence-signatures as markers of protein-protein interaction.
Sprinzak E, Margalit H. (2001). J Mol Biol. 311(4): 681-92.

How reliable are experimental protein-protein interaction data?
Sprinzak E, Sattath S, Margalit H. (2003). J Mol Biol. 327(5): 919-23.

  • Data files for correlated sequence-signatures analysis:

    Log-odds values of signature combinations
    Yeast Clusters
    Non-redundant database of interacting proteins

  • Data files for quality assessment of the protein-protein interaction data analysis:

    Interacting-protein data with source database, experimental method, and calculated TP (true-positive) rate
    Description of databases and experimental method codes

  • ICI (Intra-Complex Interactions) and NICI (Non-Intra-Complex Interactions) data files:

    List of ICI
    List of NICI
    Attribute List

