Douglas Theobald, Ph.D.
Assistant Professor of Biochemistry

Evolution, structure, and function of macromolecular complexes

Ph.D., University of Colorado at Boulder

 

Fields of Specialization

  • Structure and function of complexes between proteins and single-stranded nucleic acids
  • Likelihood and Bayesian techniques in structural bioinformatics
  • Adaptive evolution of molecular structures

Research Summary

Our lab studies the three-dimensional structures of macromolecular complexes by integrating both experimental and bioinformatic methods from the fields of X-ray crystallography, structural bioinformatics, and evolutionary theory. Our previous research has concentrated on the biophysical basis of sequence-specific recognition of unusually structured nucleic acids (such as ssDNAs and ssRNAs) and on the evolution of proteins involved in this important biological function.

Telomeric OB-folds

The OB-fold is one of the most important protein folds that specifically interacts with single-stranded DNAs and RNAs, and it is also one of the few protein superfolds. Multiple OB-fold domains are common in nucleic acid recognition. However, in general OB-fold domains are notoriously difficult to detect based upon sequence similarity alone, and most proteins containing this structural motif share little sequence similarity. The OB-fold is found in several telomere-binding proteins that specifically recognize and bind the single-stranded DNA telomeric overhang of chromosomes, including the ciliate telomere end-binding protein TEBP, the metazoan Pot1 proteins, and budding yeast Cdc13 proteins. Telomere maintenance and end-protection are essential for the survival and proliferation of eukaryotic cells, suggesting that these proteins would be highly conserved. In practice, however, evidence for bona fide homology among telomeric factors has been elusive, and, in the case of the known end-protection proteins, evolutionary relationships have been postulated largely on the basis of protein structural and functional similarity alone.

We have recently developed new bioinformatic methods for macromolecular structural comparison and for exploration of the distant evolutionary relationships among OB-fold domains, especially their telomeric representatives. What we've found is somewhat surprising: even though a billion years of evolution have nearly erased any discernible sequence similarities between individual OB-fold domains, distant similarities can be gleaned from the noise when families of domains are compared. And gratifyingly, these weak similarities reliably classify the OB-fold domains according to known cellular functions, as one would expect if these modern domains had evolved from common ancestral domains.

Bayesian and likelihood methods for structural comparison and analysis

Superpositioning macromolecular structures is an essential tool in structural bioinformatics and is used routinely in the fields of NMR, X-ray crystallography, protein folding, molecular dynamics, rational drug design, and structural evolution. Superpositioning allows comparison of structures by fitting their atomic coordinates to each other as closely as possible. Interpretation of a superposition relies upon the accuracy of the estimated orientations of the molecules, and thus reliable and robust superpositioning tools are a critical component of structural analysis and comparison.

The structural superposition problem has classically been solved with the standard statistical optimization method of least-squares (LS). However, LS can give misleading and inaccurate results, both in theory and in practice, because it implicitly treats every atom as equally variable and as uncorrelated with the others, assumptions that real macromolecules routinely violate. To correct for these shortcomings, we have applied likelihood and Bayesian techniques to the superposition problem, resulting in much more accurate superpositions and analyses of the complex correlations among the atoms within macromolecules. For more information see: http://www.theseus3d.org/.
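
For readers unfamiliar with the classical approach, the following is a minimal sketch of ordinary least-squares superposition of two structures with matched atoms, using the well-known Kabsch (SVD-based) solution. It illustrates the LS baseline only; it is not the likelihood or Bayesian method described above and is not taken from THESEUS. The function name `superpose_least_squares`, the use of NumPy, and the toy coordinates are assumptions made for this example.

```python
import numpy as np


def superpose_least_squares(mobile, target):
    """Fit `mobile` onto `target` by ordinary least squares (Kabsch algorithm).

    Both inputs are (N, 3) arrays of matched atomic coordinates.
    Every atom is weighted equally, which is exactly the LS assumption.
    Returns the fitted coordinates, the 3x3 rotation matrix, and the RMSD.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)

    # The optimal translation aligns the two centroids.
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    p = mobile - mobile_center
    q = target - target_center

    # The optimal rotation comes from the SVD of the 3x3 covariance matrix.
    covariance = p.T @ q
    u, _, vt = np.linalg.svd(covariance)

    # Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Apply the rotation to the centered coordinates, then move to the target.
    fitted = p @ rotation.T + target_center
    rmsd = np.sqrt(((fitted - target) ** 2).sum(axis=1).mean())
    return fitted, rotation, rmsd


if __name__ == "__main__":
    # Hypothetical toy coordinates; a real analysis would read matched atoms
    # (for example, alpha carbons) from two structure files.
    coords = np.array([[0.0, 0.0, 0.0],
                       [1.5, 0.0, 0.0],
                       [1.5, 1.2, 0.0],
                       [0.3, 1.0, 0.9],
                       [2.0, 0.5, 1.7]])
    angle = 0.7
    rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                    [np.sin(angle), np.cos(angle), 0.0],
                    [0.0, 0.0, 1.0]])
    moved = coords @ rot.T + np.array([3.0, -2.0, 5.0])
    _, _, rmsd = superpose_least_squares(coords, moved)
    print(f"RMSD after superposition: {rmsd:.2e}")
```

Because every atom contributes equally to the sum of squares, a few highly mobile atoms (a flexible loop, for instance) can pull the whole fit away from the well-ordered core, which is the kind of distortion that motivates the likelihood-based treatment.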

Future Goals and Research

The lab's long-term scientific goals lie in developing precise molecular understandings of the function of macromolecular assemblies, an endeavor which ultimately must be informed by evolutionary knowledge. Currently, the dominant paradigm in structural biology is neutral evolutionary theory, which assumes that the differences among homologous proteins are unimportant for their functions. However, according to the theory of natural selection, differences among proteins can be important for function. Thus, for a full understanding of the relationship between macromolecular function and structure, we consider it essential to explicitly incorporate the modern developments in population genetics regarding natural selection. Conversely, structural knowledge can also inform evolutionary inferences. Implementation of these ideas requires rigorous bioinformatic techniques and modern phylogenetic methods.

One ongoing research project involves "protein resurrection" methods, in which multiple ancient and extinct proteins are recreated in the lab, assayed experimentally for enzymatic activity, and structurally characterized at atomic resolution by crystallography. One of the goals of this research is to create a movie in which we can watch how the three-dimensional structure of a macromolecule has evolved in different lineages via point mutations, with each change correlated with changes in the molecule's biochemical function. These "structo-evo" studies will shed light on important structure-function questions, including possibilities for the rational design of proteins with novel functions and for understanding how changes in proteins can affect their function and structures.

Recent publications

Douglas L. Theobald and Deborah S. Wuttke (2008) "Accurate structural correlations from maximum likelihood superpositions." PLOS Computational Biology 4(2): e43.

Douglas L. Theobald (2007) "Punctuated equilibrium." Forthcoming in International Encyclopedia of the Social Sciences, 2nd Edition.

Douglas L. Theobald and Deborah S. Wuttke (2006) "Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem." Proceedings of the National Academy of Sciences USA 103(49): 18521-18527.

Douglas L. Theobald and Deborah S. Wuttke (2006) "THESEUS: Maximum likelihood superpositioning and analysis of macromolecular structures." Bioinformatics 22(17): 2171-2172.

Douglas L. Theobald and Deborah S. Wuttke (2005) "Divergent evolution within protein superfolds inferred from profile-based phylogenetics." Journal of Molecular Biology 354(3): 722-737.

Douglas L. Theobald (2005) "Rapid calculation of RMSD using a quaternion-based characteristic polynomial." Acta Crystallographica A 61(Pt 4): 478-480.

Douglas L. Theobald and Deborah S. Wuttke (2004) "Prediction of multiple tandem OB-fold domains in telomere end-binding proteins Pot1 and Cdc13." Structure (Cambridge) 12(10): 1877-1879.

Rachel M. Mitton-Fry, Emily M. Anderson, Douglas L. Theobald, Leslie W. Glustrom, and Deborah S. Wuttke (2004) "Structural basis for telomeric single-stranded DNA recognition by yeast Cdc13." Journal of Molecular Biology 338(2): 241-255.

Douglas L. Theobald, Rachel B. Cervantes, Victoria Lundblad, and Deborah S. Wuttke (2003) "Homology among telomeric end-protection proteins." Structure (Cambridge) 11(9): 1049-1050.

Douglas L. Theobald, Rachel M. Mitton-Fry, and Deborah S. Wuttke (2003) "Nucleic acid recognition by OB-fold proteins." Annual Review of Biophysical and Biomolecular Structure. 32: 115-133.


Last review: February 28, 2007.

