One of the most important and challenging problems in
computational biology is that of predicting the three-dimensional
structure or shape of a protein from its amino acid sequence.
As a first step toward tackling this problem, many researchers
have focused on the structural motif recognition problem:
given a known local three-dimensional structure, or motif,
determine whether this motif occurs in a given amino acid
sequence.
In this talk, I will present algorithms that use probabilistic
techniques to improve existing methods for recognizing
protein structural motifs. These algorithms are particularly
effective at eliminating false positives found by previous
methods without introducing false negatives.
We have implemented these algorithms and have tested
them on two-stranded and three-stranded coiled coils.
The coiled-coil motif has many important biological roles;
for example, it is found in some DNA-binding proteins
and plays a role in the membrane fusion of viruses such
as HIV.
Our algorithms have been codified in four programs: PairCoil
(1995), which uses pairwise correlations to significantly
improve upon existing methods for identifying two-stranded
coiled coils; MultiCoil (1997), which uses multidimensional
clustering to identify and distinguish between three-stranded
and two-stranded coiled coils; LearnCoil-Histidine Kinase
(1998) and LearnCoil-VMP (1999), which incorporate statistical
learning techniques to identify histidine kinase linker
domains and viral membrane fusion proteins, respectively,
for which few solved structures are known.
Finally, I will talk about some of the biological implications
of our work. In particular, our programs have been useful
in identifying coiled-coil-like motifs in the envelope
proteins of many viruses, such as the influenza virus,
Moloney murine leukemia virus, HIV, SIV, and visna virus,
whose structures have since been solved. This in turn
has led to antiviral drug discovery by the Kim lab.
(Portions of this work are joint with Peter S. Kim, Ethan
Wolf, Mona Singh, David Wilson, and Andrea Cochran.)