Fields of Specialization
- Structure and function of single-stranded nucleic acid-protein complexes
- Likelihood and Bayesian techniques in structural bioinformatics
- Adaptive evolution of molecular structures
Research Summary
Our lab studies the three-dimensional structures of macromolecular
complexes by integrating both experimental and bioinformatic
methods from the fields of X-ray crystallography, structural
bioinformatics, and evolutionary theory. Our previous research
has concentrated on the biophysical basis of sequence-specific
recognition of unusually structured nucleic acids (such
as ssDNAs and ssRNAs) and on the evolution of proteins involved
in this important biological function.
Telomeric OB-folds
The OB-fold is one of the most important protein folds that
specifically interact with single-stranded DNAs and RNAs,
and it is also one of the few protein superfolds. Multiple
OB-fold domains are common in nucleic acid recognition.
However, in general OB-fold domains are notoriously difficult
to detect based upon sequence similarity alone, and most
proteins containing this structural motif share little sequence
similarity. The OB-fold is found in several telomere-binding
proteins that specifically recognize and bind the single-stranded
DNA telomeric overhang of chromosomes, including the ciliate
telomere end-binding protein TEBP, the metazoan Pot1 proteins,
and budding yeast Cdc13 proteins. Telomere maintenance and
end-protection are essential for the survival and proliferation
of eukaryotic cells, suggesting that these proteins would
be highly conserved. In practice, however, evidence for
bona fide homology among telomeric factors has been elusive,
and, in the case of the known end-protection proteins, evolutionary
relationships have been postulated largely on the basis
of protein structural and functional similarity alone.
We have recently developed new bioinformatic methods for
macromolecular structural comparison and for exploration
of the distant evolutionary relationships among OB-fold
domains, especially their telomeric representatives. What
we've found is somewhat surprising: even though a billion
years of evolution have nearly erased any discernible sequence
similarities between individual OB-fold domains, distant
similarities can be gleaned from the noise when families
of domains are compared. And gratifyingly, these weak similarities
reliably classify the OB-fold domains according to known
cellular functions, as one would expect if these modern
domains had evolved from common ancestral domains.
Bayesian and likelihood methods for structural comparison
and analysis
Superpositioning macromolecular structures is an essential tool in structural
bioinformatics and is used routinely in the fields of NMR,
X-ray crystallography, protein folding, molecular dynamics,
rational drug design, and structural evolution. Superpositioning
allows comparison of structures by fitting their atomic
coordinates to each other as closely as possible. Interpretation
of a superposition relies upon the accuracy of the estimated
orientations of the molecules, and thus reliable and robust
superpositioning tools are a critical component of structural
analysis and comparison.
The structural superposition problem has classically been solved
with the standard statistical optimization method of least squares
(LS). However, LS implicitly assumes that all atoms have equal
variance and are uncorrelated, assumptions that real macromolecules
violate, so it can give misleading and inaccurate results in
theory and in practice. To correct for the shortcomings
of LS, we have applied likelihood and Bayesian techniques
to the superposition problem, resulting in much more accurate
superpositions and analyses of the complex correlations
among the atoms within macromolecules. For more information
see: http://www.theseus3d.org/.
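As an illustration of the classical least-squares approach described above (not the likelihood or Bayesian methods implemented in THESEUS), a minimal sketch using the standard SVD-based Kabsch algorithm might look like the following. The function name `kabsch_superpose`, the use of NumPy, and the toy coordinates are assumptions made for illustration only.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Least-squares superposition of `mobile` onto `target`.

    Both inputs are N x 3 arrays of matched atomic coordinates
    (row i of `mobile` corresponds to row i of `target`).
    Returns the fitted coordinates, the rotation matrix, and the RMSD.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)

    # Remove the translational component by centering on the centroids.
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    P = mobile - mobile_centroid
    Q = target - target_centroid

    # 3 x 3 cross-covariance matrix and its singular value decomposition.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0.0 else -1.0
    D = np.diag([1.0, 1.0, d])

    # Optimal rotation in the least-squares sense.
    R = Vt.T @ D @ U.T

    # Rotate the centered mobile coordinates and move them onto the target.
    fitted = P @ R.T + target_centroid
    rmsd = np.sqrt(np.mean(np.sum((fitted - target) ** 2, axis=1)))
    return fitted, R, rmsd


if __name__ == "__main__":
    # Toy example: rotate and translate random points, then recover the fit.
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(10, 3))
    angle = np.deg2rad(30.0)
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    moved = coords @ rotation.T + np.array([5.0, -1.0, 2.0])
    _, _, rmsd = kabsch_superpose(coords, moved)
    print(f"RMSD after superposition: {rmsd:.2e}")
```

This sketch gives every atom the same weight in the fit, which is the behavior that the likelihood and Bayesian treatments are meant to improve upon.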
Future Goals and Research
The lab's long-term scientific goals lie in developing
precise molecular understandings of the function of macromolecular
assemblies, an endeavor which ultimately must be informed
by evolutionary knowledge. Currently, the dominant paradigm
in structural biology is neutral evolutionary theory, which
assumes that the differences among homologous proteins are
unimportant for their functions. However, according to the
theory of natural selection, differences among proteins
can be important for function. Thus, for a full understanding
of the relationship between macromolecular function and
structure, we consider it essential to explicitly incorporate
the modern developments in population genetics regarding
natural selection. Conversely, structural knowledge can
also inform evolutionary inferences. Implementation of these
ideas requires rigorous bioinformatic techniques and modern
phylogenetic methods.
One ongoing research project involves "protein resurrection"
methods, in which multiple ancient and extinct proteins
are recreated in the lab, assayed experimentally for enzymatic
activity, and their atomic-resolution structures are determined
by crystallography. One of the goals of this research is
to create a movie in which we can watch how the three-dimensional
structure of a macromolecule has evolved in different lineages
via point mutations, with each change correlated with changes
in the molecule's biochemical function. These "structo-evo"
studies will shed light on important structure-function
questions, such as how changes in a protein affect its structure
and function, and may open possibilities for the rational
design of proteins with novel functions.
Recent publications
Kang K, Pulver SR, Panzano VC, Chang EC, Griffith LC, Theobald DL and Garrity PA. "Analysis of Drosophila TRPA1 reveals an ancient origin for human chemical nociception." Nature (2010). (forthcoming)
Theobald DL. "Likelihood and empirical Bayes superpositions of multiple macromolecular structures." Bayesian methods in structural bioinformatics. Ed. Hamelryck T, Mardia KV, and Ferkinghoff-Borg J. New York: Springer Verlag, 2010. (forthcoming)
Liu P, Agrafiotis DK, and Theobald DL. "Fast determination of the optimal rotational matrix for macromolecular superpositions." Journal of Computational Chemistry (2010). Early view, online in advance of print.
Theobald DL and Miller C. "Membrane transport proteins: Surprises in structural sameness." Nature Structural & Molecular Biology 17. 1 (2010): 2-3.
Theobald DL. "A nonisotropic Bayesian approach to superpositioning multiple macromolecules." Proc. of the 28th Leeds Annual Statistical Research (LASR) Workshop, "Statistical Tools for Challenges in Bioinformatics". University of Leeds, UK: 2009.
Theobald DL and Wuttke DS. "Accurate structural correlations from maximum likelihood superpositions." PLoS Comput Biol 4. 2 (2008): e43.
Theobald DL, Darity WA. "Punctuated equilibrium." International Encyclopedia of the Social Sciences. 2nd ed. 1 vol. 2007.
Theobald DL and Wuttke DS. "Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem." Proc Natl Acad Sci U S A 103. 49 (2006): 18521-7.
Theobald DL and Wuttke DS. "THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures." Bioinformatics 22. 17 (2006): 2171-2.
Theobald DL and Wuttke DS. "Divergent evolution within protein superfolds inferred from profile-based phylogenetics." J Mol Biol 354. 3 (2005): 722-37.
Theobald DL. "Rapid calculation of RMSDs using a quaternion-based characteristic polynomial." Acta Crystallogr A 61. Pt 4 (2005): 478-80.
Mitton-Fry RM, Anderson EM, Theobald DL, Glustrom LW, and Wuttke DS. "Structural basis for telomeric single-stranded DNA recognition by yeast Cdc13." J Mol Biol 338. 2 (2004): 241-55.
Theobald DL and Wuttke DS. "Prediction of multiple tandem OB-fold domains in telomere end-binding proteins Pot1 and Cdc13." Structure 12. 10 (2004): 1877-9.
Theobald DL and Schultz SC. "Nucleotide shuffling and ssDNA recognition in Oxytricha nova telomere end-binding protein complexes." Embo J 22. 16 (2003): 4314-24.
Theobald DL, Cervantes RB, Lundblad V, and Wuttke DS. "Homology among telomeric end-protection proteins." Structure 11. 9 (2003): 1049-50.
Theobald DL, Mitton-Fry RM, and Wuttke DS. "Nucleic acid recognition by OB-fold proteins." Annual Review of Biophysics and Biomolecular Structure 32 (2003): 115-33.
Last review: January 28, 2010.