The 20x20 scoring matrix resulting from this method would be a PAM-1 matrix, where
PAM = Percent Accepted Mutations (also expanded as Point Accepted Mutation); 1 = one step. To get the PAM-250 scoring matrix we have to go back to
the kludged Mutation Probability Matrix, M, raise it to the 250th power to move 250 steps into the
future, then rederive a scoring matrix from that point (the number 250 was decided upon presumably by the
same cabal in the same smoke-filled room, when they weren't covering up their work on extraterrestrials).
The result is a widely used scoring matrix, but it is not the only choice.
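For the curious, here is a minimal sketch of the "250 steps into the future" idea on a toy three-letter alphabet. Everything in it is an assumption made for illustration: the one-step matrix pam1 and the background frequencies freqs are made-up numbers, the helper names are hypothetical, and the scoring step assumes a log-odds form, 10 * log10(M[i][j] / f[j]), which may differ in base or scaling from the derivation on the previous pages.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, steps):
    """Raise m to a non-negative integer power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in m]
    while steps > 0:
        if steps & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        steps >>= 1
    return result

def log_odds(m, freqs):
    """Assumed scoring form: 10 * log10(P(i -> j) / background frequency of j)."""
    n = len(m)
    return [[10 * math.log10(m[i][j] / freqs[j]) for j in range(n)]
            for i in range(n)]

# Made-up one-step matrix: row i holds the probabilities that residue i
# becomes each residue after one step. Each row sums to 1 and the
# average diagonal entry is 0.99.
pam1 = [
    [0.990, 0.007, 0.003],
    [0.007, 0.988, 0.005],
    [0.003, 0.005, 0.992],
]
freqs = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform background frequencies

pam250 = mat_pow(pam1, 250)           # 250 steps into the future
scores250 = log_odds(pam250, freqs)   # rederive scores from that point

for row in scores250:
    print(["%.2f" % s for s in row])
```

Note that the powering is done on the probability matrix, not on the scoring matrix; scores are only rederived at the end.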
The derivation of a scoring matrix given on the previous pages was fairly straightforward
and rigorous. Notice that qAA + qBB + qCC = 0.566, which is
the total probability of no change, i.e., the probability that over one step a residue will
remain unchanged. This number is fairly low. Over only one evolutionary step we should
expect a lot more stability than that.
It was arbitrarily decided in a smoke-filled room at a secret meeting of geneticists that
an acceptable one-step stability probability should be 0.99 (99 percent
stable, not mutating), and that this would define what one step meant. Having eradicated
all opposition in a series of daring night-time raids, they devised a mathematical kludge
that accomplished their goal. It's not a bad kludge, although it's not hard to imagine
special conditions in which it gives nonsense results. Still, we may safely assume that
such conditions are not present in the real world.
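To make the idea concrete, here is one possible rescaling that hits the 0.99 target. It is only a sketch and not necessarily the kludge referred to here: it assumes q[i][j] is a symmetric table of joint probabilities summing to 1, shrinks every off-diagonal entry by a common factor, and moves the freed-up probability onto the diagonal so that each residue's overall frequency is unchanged. The function name and the numbers in q are made up; the numbers were chosen only so that the diagonal sums to 0.566.

```python
def rescale_to_stability(q, target=0.99):
    """Shrink off-diagonal joint probabilities so the diagonal sums to target.

    Assumes q[i][j] is the joint probability of residue i paired with
    residue j, with all entries summing to 1. Row sums are preserved.
    """
    n = len(q)
    marginals = [sum(row) for row in q]
    off_diag = 1.0 - sum(q[i][i] for i in range(n))
    lam = (1.0 - target) / off_diag  # common shrink factor
    scaled = [[lam * q[i][j] if i != j else 0.0 for j in range(n)]
              for i in range(n)]
    for i in range(n):
        # Whatever probability the off-diagonals gave up goes to "no change".
        scaled[i][i] = marginals[i] - sum(scaled[i])
    return scaled

# Hypothetical three-letter example; diagonal sums to 0.566.
q = [
    [0.300, 0.100, 0.067],
    [0.100, 0.166, 0.050],
    [0.067, 0.050, 0.100],
]
q_one_step = rescale_to_stability(q)
print(sum(q_one_step[i][i] for i in range(3)))  # approximately 0.99
```

Special conditions are easy to find for this sketch too: if the original table has no off-diagonal mass at all, the shrink factor is undefined, and if the table is already more stable than the target, the factor exceeds 1 and a diagonal entry can go negative.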
To complete this section let me say a brief word about multiple alignments.
Often the task of a geneticist is to take a single query sequence, pop it into a database,
find the optimal pairwise score for aligning that sequence with each member of the database,
then pick out the scores above a certain cut-off and look at the corresponding optimal alignments.
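The search-and-cut-off routine just described might be sketched as follows. The assumptions are mine: the pairwise score is a global alignment score with a simple match/mismatch/linear-gap scheme (a real search would use a scoring matrix such as PAM-250 and possibly local alignment), the cut-off is inclusive, and the database, sequence names, and function names are all hypothetical.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score by dynamic programming (two rows)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def search(query, database, cutoff):
    """Score the query against every entry; keep scores at or above cutoff."""
    hits = [(global_score(query, seq), name) for name, seq in database.items()]
    return sorted((h for h in hits if h[0] >= cutoff), reverse=True)

# Hypothetical toy database.
database = {"seq1": "ACGT", "seq2": "AGT", "seq3": "TTTT"}
hits = search("ACGT", database, cutoff=1)
print(hits)  # [(4, 'seq1'), (1, 'seq2')]
```

Only the scores are computed here; the corresponding alignments would be recovered by a traceback through the full dynamic programming table for the hits that survive the cut-off.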
Multiple Sequence Alignments
Sometimes, however, we have a collection of k > 2 sequences that we are pretty sure are all related,
and to better understand that relationship we want to align all k members of the collection
at once. There is an example of a multiple alignment, although perhaps not an optimal one, just above,
with k = 4. The simplest, and arguably the most reasonable, way to score this is to sum all the pairwise
scores that appear in the multiple alignment, giving any gap-gap alignments a score of zero. (I'll
leave it to the reader to prove that there are 36 pairwise scores in the example above, with 5 of
those being gap-gap zeros.) Anyway, there are dynamic programming algorithms similar to those
already discussed, which will yield optimal alignments and scores for any collection of sequences.
We needn't go into details here.
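Still, the sum-of-pairs score itself is easy to sketch. The alignment below is made up (it is not the example shown earlier), the match/mismatch scores and the linear gap penalty are assumptions, and the function name is hypothetical. Each column of k residues contributes k(k-1)/2 pairwise comparisons, with gap-gap pairs scoring zero as described above.

```python
from itertools import combinations

def sum_of_pairs(alignment, pair_score, gap_penalty=-2):
    """Return (total score, number of pairwise comparisons, gap-gap count)."""
    total = 0
    comparisons = 0
    gap_gap = 0
    for column in zip(*alignment):            # walk the alignment column by column
        for x, y in combinations(column, 2):  # every pair of rows in this column
            comparisons += 1
            if x == "-" and y == "-":
                gap_gap += 1                  # gap-gap pairs score zero
            elif x == "-" or y == "-":
                total += gap_penalty          # residue against gap
            else:
                total += pair_score(x, y)     # residue against residue
    return total, comparisons, gap_gap

def simple_score(x, y):
    """Assumed toy scoring: +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1

# Hypothetical alignment of k = 4 sequences, 6 columns.
alignment = [
    "AC-GT-",
    "A--GTA",
    "ACCG--",
    "A--GTA",
]
total, comparisons, gap_gap = sum_of_pairs(alignment, simple_score)
print(total, comparisons, gap_gap)  # -11 36 5
```

With 4 rows there are 6 pairs per column, so 6 columns give 36 comparisons. Swapping simple_score for a lookup into a real scoring matrix leaves the rest of the function unchanged.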
We will discuss more advanced aspects of these subjects in a fifth section if funding can be found
to support the effort. Meanwhile, this concludes the alignment section. The fourth section covers