In a perfect world, every symbol in a code alphabet occurs with equal probability; that way the information content is maximized. Suppose we have the two-symbol alphabet {0, 1}. Each residue is worth one bit, but only if each symbol occurs with probability p = 0.5 (so the total probability for the two symbols is 0.5 + 0.5 = 1). If the symbols occur with unequal probabilities, each residue is actually worth less than one bit.

For example, suppose the probability that the symbol 0 will occur in a given residue is p0 = 1, and the probability that the symbol 1 will occur is p1 = 0. That is, 1 has no chance of occurring and 0 will definitely occur (again, the total probability is p0 + p1 = 1). This leaves us with one viable symbol, but one symbol can't answer any questions (let's face it, the sound of one hand clapping is silence, unless you do that funny thing of bending the hand down to flap the fingertips against the palm and...). There's no information in a residue if it is necessarily filled with a given symbol. So even though our alphabet has two symbols, the information per residue in this case is zero bits.

So, equal probabilities yield one bit/residue, the maximum in this case; probabilities of 0 and 1 yield zero bits/residue. What if p0 = 0.8 and p1 = 0.2? In that case the number of bits/residue is somewhere between one and zero. In particular, it's

-[0.8 log2(0.8) + 0.2 log2(0.2)] = 0.722.

What's that mean? Well, it just means that if someone sends you a coded message using two symbols with probabilities 0.8 and 0.2, each residue of the message carries a little less information (about 0.722 bits instead of a full bit) than it would have if the probabilities had been 0.5 and 0.5.
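The three cases above (equal probabilities, a certain symbol, and the 0.8/0.2 split) can be checked with a few lines of Python. This is just a sketch; the function name is mine, and the `if p > 0` filter handles the convention that a zero-probability symbol contributes zero bits:

```python
import math

def bits_per_residue(probs):
    # Shannon entropy: -sum of p * log2(p) over the symbol probabilities.
    # Zero-probability symbols are skipped (they contribute nothing, and
    # log2(0) is undefined). The leading 0.0 avoids returning -0.0 in the
    # degenerate case where one symbol is certain.
    return 0.0 - sum(p * math.log2(p) for p in probs if p > 0)

print(round(bits_per_residue([0.5, 0.5]), 3))  # equal probabilities -> 1.0
print(round(bits_per_residue([1.0, 0.0]), 3))  # one certain symbol  -> 0.0
print(round(bits_per_residue([0.8, 0.2]), 3))  # unequal split       -> 0.722
```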

NOTE! Bits/residue is the number of bits associated with each residue (i.e., digit, i.e., position). If you have a code of length 150 (150 residues), multiply the bits/residue by 150 to get the total number of bits for the whole message.
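As a quick sketch of that multiplication, here's the total information in a 150-residue message built from the 0.8/0.2 alphabet (the variable names are mine):

```python
import math

p0, p1 = 0.8, 0.2
h = -(p0 * math.log2(p0) + p1 * math.log2(p1))  # bits per residue, ~0.722
length = 150                                    # number of residues
total_bits = length * h                         # total information content

print(round(total_bits, 1))  # -> 108.3, versus 150 bits for a 0.5/0.5 alphabet
```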