In much of what follows we will be exploring the statistical significance of genes that are to some degree similar, and trying to interpret that similarity in an evolutionary context. But before that, one final word on entropy and the makeup of the average gene. Entropy in our context can be thought of as the tendency of a code over the course of several generations to mutate into an increasingly disordered state, given that we allow in each generation (whatever that means) a certain probability that each symbol will mutate.
On the next page is a simple entropy generator. There are 28 objects which can occur in two states, so it's equivalent to a binary sequence with 28 residues, which has 2^{28} = 268,435,456 possible states. Of these millions of states the vast majority are to a large degree disordered, lacking any meaningful pattern.


The initial state is very ordered: all the objects are in the same state. Pressing the button on the left allows one generation to pass, during which each object has one chance out of 20 (p=0.05) of mutating into the other state.
Neither state is ultimately prefered. The probability of each state mutating into the other during the course of a generation is the same for both states: p=0.05. The tendency is, therefore, for both states to eventually become roughly equally represented. Try it. In about 10 to 20 generations there should be nearly 14 of each state, and from that time onward it should stay like that, although the pattern will shift from one disordered state into another. You should be able to calculate how many states have 14 of each, or 15 of one and 13 of another, and so figure the probability of such states occuring randomly.
The case for amino acids is more complicated. The probability of any given codon mutating into any other codon will depend on the initial and final codons. And even should a mutation occur, it may reduce the survivability of the cell or entity in which it is housed, and so eventually die out. In the end, however, as is the case for the simple example on the next page, after sufficient generations have passed the amino acid codons will settle into percentage representations that should stay roughly constant thereafter (although it is quite conceivable that a change in environment would given rise to stresses that force an eventual change in the percentages).
