
The genome revolution has produced an abundance of protein sequence data. Traditional homology-based computer methods make it possible to establish evolutionary relationships between large numbers of these proteins. Yet among any set of new protein sequences, for example those from the complete genome of a newly sequenced organism, a significant fraction cannot be assigned functions by traditional methods. A new sequence may have no recognizable homologs in other organisms, or it may have recognizable homologs whose cellular functions are themselves unknown. The critical need to make functional inferences for the vast numbers of proteins that could not be functionally annotated by traditional homology methods led, in 1999 and the years that followed, to new ideas for inferring ‘functional linkages’ between proteins not related to each other by homology.

These ‘non-homology’ or ‘genomic context’ methods included the ‘Phylogenetic Profile’ method and the ‘Rosetta-Stone’ method (both pioneered principally by Edward Marcotte and Matteo Pellegrini when they were postdocs with Eisenberg and Yeates), among others. Subsequent work has aimed to extend those ideas. One recent extension of Phylogenetic Profiles (developed by Peter Bowers and Shawn Cokus) applies logic analysis to uncover proteins whose presence or absence across organisms is related to the presence or absence of two other proteins, taken in logical combination. Such higher-order relationships are expected to be abundant in the cell, but they are not detected by the original Phylogenetic Profile method, which looks for direct similarity between the profiles of just two proteins at a time.
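The difference between pairwise profile comparison and logic analysis can be illustrated with a small toy example. The sketch below is not the published implementation: the profiles, the protein names A to D, and the function names are all hypothetical, and exact matching stands in for whatever statistical scoring a real analysis would use. Each profile is a tuple of 0/1 values, one per genome, recording whether the protein is present.

```python
from itertools import combinations

# Hypothetical toy profiles: 1 = protein present in that genome, 0 = absent.
profiles = {
    "A": (1, 1, 0, 0, 1, 0, 1, 0),
    "B": (1, 0, 1, 0, 1, 1, 0, 0),
    "C": (1, 0, 0, 0, 1, 0, 0, 0),  # present only where both A and B are present
    "D": (1, 1, 0, 0, 1, 0, 1, 0),  # identical to A
}

# Logical combinations of two profiles tried at each genome position.
LOGIC = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
}


def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(x != y for x, y in zip(p, q))


def pairwise_links(profiles, max_mismatches=0):
    """Pairs of proteins whose profiles are directly similar."""
    names = sorted(profiles)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if hamming(profiles[a], profiles[b]) <= max_mismatches
    ]


def logic_links(profiles):
    """Triplets (a, b, op, c) where c's profile equals op(a, b) exactly
    but matches neither a's nor b's profile on its own."""
    names = sorted(profiles)
    hits = []
    for a, b in combinations(names, 2):
        pa, pb = profiles[a], profiles[b]
        for c in names:
            pc = profiles[c]
            if c in (a, b) or pc == pa or pc == pb:
                continue
            for op_name, op in LOGIC.items():
                combined = tuple(op(x, y) for x, y in zip(pa, pb))
                if combined == pc:
                    hits.append((a, b, op_name, c))
    return hits


pairs = pairwise_links(profiles)
triplets = logic_links(profiles)
print(pairs)
print(triplets)
```

In this toy data, C differs from both A and B in two genomes, so a strict pairwise comparison does not link it to either one; it is recovered only when A and B are considered together under AND.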

Our computational genomics work has touched on many other subjects as well: disulfide bonding in thermophiles, repetitive protein sequences, genomic encoding of unusual amino acids such as selenocysteine and pyrrolysine, detection of protein targeting sequences, and the function of bacterial microcompartments.
References:
2011
Jorda J, Yeates TO. Widespread disulfide bonding in proteins from thermophilic archaea. Archaea. 2011. 2011:409156. PMID: 21941460. PMCID: PMC3177088. DOI: 10.1155/2011/409156
2010
Fan C, Cheng S, Liu Y, Escobar CM, Crowley CS, Jefferson RE, Yeates TO, Bobik TA. Short N-terminal sequences package proteins into bacterial microcompartments. Proc. Natl. Acad. Sci. U.S.A. Apr 2010. 107(16):7509-14. PMID: 20308536. PMCID: PMC2867708. DOI: 10.1073/pnas.0913199107
2009
Beeby M, Bobik TA, Yeates TO. Exploiting genomic patterns to discover new supramolecular protein assemblies. Protein Sci. Jan 2009. 18(1):69-79. PMID: 19177352. PMCID: PMC2708037. DOI: 10.1002/pro.1
Sprinzak E, Cokus SJ, Yeates TO, Eisenberg D, Pellegrini M. Detecting coordinated regulation of multi-protein complexes using logic analysis of gene expression. BMC Syst Biol. 2009. 3:115. PMID: 20003439. PMCID: PMC2804736. DOI: 10.1186/1752-0509-3-115
2005
Bowers PM, O'Connor BD, Cokus SJ, Sprinzak E, Yeates TO, Eisenberg D. Utilizing logical relationships in genomic data to decipher cellular processes. FEBS J. Oct 2005. 272(20):5110-8. PMID: 16218945. DOI: 10.1111/j.1742-4658.2005.04946.x
Beeby M, O'Connor BD, Ryttersgaard C, Boutz DR, Perry LJ, Yeates TO. The genomics of disulfide bonding and protein stabilization in thermophiles. PLoS Biol. Sep 2005. 3(9):e309. PMID: 16111437. PMCID: PMC1188242. DOI: 10.1371/journal.pbio.0030309
Chaudhuri BN, Yeates TO. A computational method to predict genetically encoded rare amino acids in proteins. Genome Biol. 2005. 6(9):R79. PMID: 16168086. PMCID: PMC1242214. DOI: 10.1186/gb-2005-6-9-r79
2004
Bowers PM, Cokus SJ, Eisenberg D, Yeates TO. Use of logic relationships to decipher protein network organization. Science. Dec 2004. 306(5705):2246-9. PMID: 15618515. DOI: 10.1126/science.1103330
O'Connor BD, Yeates TO. GDAP: a web tool for genome-wide protein disulfide bond prediction. Nucleic Acids Res. Jul 2004. 32(Web Server issue):W360-4. PMID: 15215411. PMCID: PMC441514. DOI: 10.1093/nar/gkh376
Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D. Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol. 2004. 5(5):R35. PMID: 15128449. PMCID: PMC416471. DOI: 10.1186/gb-2004-5-5-r35
2003
Strong M, Graeber TG, Beeby M, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D. Visualization and interpretation of protein networks in Mycobacterium tuberculosis based on hierarchical clustering of genome-wide functional linkage maps. Nucleic Acids Res. Dec 2003. 31(24):7099-109. PMID: 14654685. PMCID: PMC291866
2002
Mallick P, Boutz DR, Eisenberg D, Yeates TO. Genomic evidence that the intracellular proteins of archaeal microbes contain disulfide bonds. Proc. Natl. Acad. Sci. U.S.A. Jul 2002. 99(15):9679-84. PMID: 12107280. PMCID: PMC124975. DOI: 10.1073/pnas.142310499
2000
Eisenberg D, Marcotte EM, Xenarios I, Yeates TO. Protein function in the post-genomic era. Nature. Jun 2000. 405(6788):823-6. PMID: 10866208. DOI: 10.1038/35015694
1999
Pellegrini M, Yeates TO. Searching for frameshift evolutionary relationships between protein sequence families. Proteins. Nov 1999. 37(2):278-83. PMID: 10584072
Marcotte EM, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D. A combined algorithm for genome-wide prediction of protein function. Nature. Nov 1999. 402(6757):83-6. PMID: 10573421. DOI: 10.1038/47048
Marcotte EM, Pellegrini M, Yeates TO, Eisenberg D. A census of protein repeats. J. Mol. Biol. Oct 1999. 293(1):151-60. PMID: 10512723. DOI: 10.1006/jmbi.1999.3136
Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D. Detecting protein function and protein-protein interactions from genome sequences. Science. Jul 1999. 285(5428):751-3. PMID: 10427000
Pellegrini M, Marcotte EM, Yeates TO. A fast algorithm for genome-wide analysis of proteins with repeated sequences. Proteins. Jun 1999. 35(4):440-6. PMID: 10382671
Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl. Acad. Sci. U.S.A. Apr 1999. 96(8):4285-8. PMID: 10200254. PMCID: PMC16324