Sequence Q: 
A  B  C  C  C  A  B  A  A  C 
A  C  C  C  B  A  C  B  C  C 
C  A  A  B  B  B  A  A  B  B 
Sequence L: 
C  B  C  C  C  A  B  C  B  C 
B  B  C  A  B  C  B  B  C  C 
A  C  A  B  B  A  A  C  B  C 
Random Alignment Probabilities.
The two sequences above are assumed to be biological, and to be subject to biological constraints.
One place we see this is in the codon frequencies observed across both sequences (60 codons in all): p_{A}=16/60, p_{B}=20/60 and p_{C}=24/60. In a
perfectly random world we might expect the A, B and C codons to occur with equal probability, which
would be 20/60 = 1/3. But in this imaginary world with its imaginary genes, the C codon is more useful
biologically than the A, so the A's occur less frequently than the C's.
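The frequencies quoted above can be checked by counting. Here is a minimal Python sketch, assuming the two sequences are typed in exactly as listed (spaces removed); the variable names are ours and purely illustrative.

```python
from collections import Counter
from fractions import Fraction

# Sequences Q and L as listed above, one string literal per printed row.
Q = "ABCCCABAAC" "ACCCBACBCC" "CAABBBAABB"
L = "CBCCCABCBC" "BBCABCBBCC" "ACABBAACBC"

# Count every codon in both sequences together.
counts = Counter(Q + L)
total = sum(counts.values())

# Exact frequencies as fractions of the total codon count.
p = {codon: Fraction(counts[codon], total) for codon in "ABC"}

for codon in "ABC":
    print(f"p_{codon} = {counts[codon]}/{total} = {float(p[codon]):.5f}")
```

Using Fraction keeps the arithmetic exact, so the values can be compared directly with 16/60, 20/60 and 24/60.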
There is yet another level at which biology reduces randomness, and that is in having a preference for some
mutations over others. Let's suppose we have a large pool of A, B and C codons with which we want to
build a 30-residue gene, and each time we draw a codon to fill a residue we have a chance of 16/60 that
the codon is an A, 20/60 that it's a B, and 24/60 that it's a C. Then on average such genes will have the same
fraction of A's, B's and C's as the genes Q and L above. Now let's make another random
30-residue gene with the same codon probabilities, and align it with the first. Take a look at position 1.
What is the probability that it's an A aligned with an A? This probability, denote it P_{AA}, is
just the probability that the first sequence has an A in that position times the probability that the
second sequence has an A in that position, because the sequences are independent. The same
argument can be made for all the pairwise random alignment probabilities. This gives us a matrix of
random pairwise alignment probabilities:
P_{AA}=p_{A}p_{A}=0.07111  P_{AB}=p_{A}p_{B}=0.08889  P_{AC}=p_{A}p_{C}=0.10667
P_{BA}=p_{B}p_{A}=0.08889  P_{BB}=p_{B}p_{B}=0.11111  P_{BC}=p_{B}p_{C}=0.13333
P_{CA}=p_{C}p_{A}=0.10667  P_{CB}=p_{C}p_{B}=0.13333  P_{CC}=p_{C}p_{C}=0.16000
Note that this P matrix is symmetric.
Since these nine alignments are everything that can happen, the sum of these nine probabilities must
be 1 (NOTE: this is not a probability matrix of the kind developed on the previous pages; in that case
the sum of the probabilities down each column was 1).
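The random alignment probabilities are just products of the codon frequencies, which a short Python sketch can confirm. The dictionary layout and names here are our own choice, not something fixed by the text.

```python
from fractions import Fraction

codons = "ABC"
p = {"A": Fraction(16, 60), "B": Fraction(20, 60), "C": Fraction(24, 60)}

# P[i, j] = p_i * p_j, because the two random sequences are independent.
P = {(i, j): p[i] * p[j] for i in codons for j in codons}

for i in codons:
    print("  ".join(f"P_{i}{j}={float(P[i, j]):.5f}" for j in codons))

# All nine entries together sum to 1, and the matrix is symmetric.
total = sum(P.values())
symmetric = all(P[i, j] == P[j, i] for i in codons for j in codons)

# Each row sums to p_i rather than to 1.
row_sums = {i: sum(P[i, j] for j in codons) for i in codons}
print(total, symmetric)
```

The row sums come out as p_A, p_B and p_C, which is one way to see that this is not a column-normalized probability matrix like the mutation matrix.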

Non-Random Alignment Probabilities.
This is the mutation probability matrix from the previous page:
6/16  3/20  7/24 
3/16  14/20  3/24 
7/16  3/20  14/24 
From this we will form a new matrix the components of which are defined below:
q_{AA}=M_{AA}p_{A}=(6/60)=0.1  q_{AB}=M_{AB}p_{B}=(3/60)=0.05  q_{AC}=M_{AC}p_{C}=(7/60)=0.11667
q_{BA}=M_{BA}p_{A}=(3/60)=0.05  q_{BB}=M_{BB}p_{B}=(14/60)=0.23333  q_{BC}=M_{BC}p_{C}=(3/60)=0.05
q_{CA}=M_{CA}p_{A}=(7/60)=0.11667  q_{CB}=M_{CB}p_{B}=(3/60)=0.05  q_{CC}=M_{CC}p_{C}=(14/60)=0.23333
This matrix, like the P matrix, is symmetric, and also like the P matrix the sum of all nine components is 1.
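To see the construction in one place, here is a small Python sketch that multiplies each column of the mutation matrix by the matching codon frequency. It assumes the rows and columns of M are ordered A, B, C, with the first subscript naming the row; the names are illustrative only.

```python
from fractions import Fraction as F

codons = "ABC"
p = {"A": F(16, 60), "B": F(20, 60), "C": F(24, 60)}

# Mutation probability matrix M, rows and columns in the order A, B, C.
M_rows = [
    [F(6, 16), F(3, 20), F(7, 24)],
    [F(3, 16), F(14, 20), F(3, 24)],
    [F(7, 16), F(3, 20), F(14, 24)],
]
M = {(i, j): M_rows[r][c]
     for r, i in enumerate(codons)
     for c, j in enumerate(codons)}

# q_ij = M_ij * p_j: column j of M is scaled by the frequency of codon j.
q = {(i, j): M[i, j] * p[j] for i in codons for j in codons}

for i in codons:
    print("  ".join(f"q_{i}{j}={float(q[i, j]):.5f}" for j in codons))

total = sum(q.values())
symmetric = all(q[i, j] == q[j, i] for i in codons for j in codons)
print(total, symmetric)
```

Every denominator in a column of M cancels against the numerator of the corresponding p, which is why each q entry reduces to a count over 60.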
Let's take a look at one of the components and make sense of it. For example,
q_{BA}=M_{BA}p_{A}
= [(# times A aligned with B)/(# times A occurs)] x [(# times A occurs)/(# of total codons)]
= (# times A aligned with B)/(# of total codons) = (# times B aligned with A)/(# of total codons)
= q_{AB}.
This would seem to be exactly the same as P_{BA}, that is, the probability of finding an
A aligned with a B. The difference is that P_{BA} has no connection to the idea that some
alignments are biologically preferred; it's a random alignment probability. q_{BA}, on the
other hand, is defined in terms of M_{BA}, the value of which is determined by studying
a biological genome database (which in this case has two genes).
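Since the database here is just the two genes Q and L, the q values can also be obtained by counting aligned pairs directly. The sketch below is one way to do it in Python. It assumes that each aligned column is counted once in each direction (so an A over a B adds one to both the AB and the BA tally, and an A over an A adds two to AA); that convention is our assumption, chosen because it reproduces the values above.

```python
from collections import Counter
from fractions import Fraction

codons = "ABC"
Q = "ABCCCABAAC" "ACCCBACBCC" "CAABBBAABB"
L = "CBCCCABCBC" "BBCABCBBCC" "ACABBAACBC"

# Tally every aligned column in both directions (assumed convention).
pair_counts = Counter()
for a, b in zip(Q, L):
    pair_counts[a, b] += 1
    pair_counts[b, a] += 1

# Total number of codons in the two genes.
total = len(Q) + len(L)

# q_ij = (# times j aligned with i) / (# of total codons)
q = {(i, j): Fraction(pair_counts[i, j], total)
     for i in codons for j in codons}

for i in codons:
    print("  ".join(f"q_{i}{j}={pair_counts[i, j]}/{total}" for j in codons))
```

Counted this way, the tallies are symmetric by construction, which is the counting argument behind q_{BA} = q_{AB}.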

The Point.
Suppose q_{BA} > P_{BA}. That means an A aligned with a B, that is, an A arising from the mutation
of a B, turns up more frequently than we would expect given purely random mutations.
That is, Nature likes this idea, and if Nature likes it, then we had better score it positively,
despite the fact it's a mismatch.
Likewise, if q_{BA} < P_{BA}, then Nature disapproves. To keep her mollified we
choose to score this mismatch negatively. The way we achieve reasonable scores is to produce
from P and q a log-odds matrix, which we'll do on the next page.
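Before any scores are computed, the comparison itself can be sketched in a few lines of Python. This only reports whether each q entry is above or below the matching P entry; it is not the log-odds construction, and the labels "favored" and "disfavored" are our own shorthand.

```python
from fractions import Fraction

codons = "ABC"
p = {"A": Fraction(16, 60), "B": Fraction(20, 60), "C": Fraction(24, 60)}

# q entries in sixtieths, rows and columns in the order A, B, C.
q_sixtieths = [[6, 3, 7], [3, 14, 3], [7, 3, 14]]
q = {(i, j): Fraction(q_sixtieths[r][c], 60)
     for r, i in enumerate(codons)
     for c, j in enumerate(codons)}

# Random alignment probabilities P_ij = p_i * p_j.
P = {(i, j): p[i] * p[j] for i in codons for j in codons}

verdict = {}
for i in codons:
    for j in codons:
        if q[i, j] > P[i, j]:
            verdict[i, j] = "favored"
        elif q[i, j] < P[i, j]:
            verdict[i, j] = "disfavored"
        else:
            verdict[i, j] = "neutral"
        print(f"{i}{j}: q={float(q[i, j]):.5f}  P={float(P[i, j]):.5f}  {verdict[i, j]}")
```

With the numbers on this page, the three matches and the A-C mismatch come out favored, while the A-B and B-C mismatches come out disfavored.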

