Standard Deviation.

As already mentioned, the standard deviation arises out of the study of the Normal distribution, and it is ideally suited to measuring the spread of that distribution. The bigger the standard deviation, the wider and lower the distribution. This is true in general for all the distributions we will look at, but establishing a connection between the standard deviation and probabilities is more difficult with most other distributions, especially if they are not symmetric. In that case the connection depends on whether you are looking at the left or the right side of the distribution.

The other problem we face is differentiating between the population standard deviation and a sample standard deviation. Even when we know the kind of distribution a given type of data should fall in, we usually will not know how spread out the whole population of data is; to calculate that exactly we would need the whole population, and usually we can't have it, so there's no way to calculate an exact standard deviation. We usually have to settle for a sample of data values, and from these we can get only an approximation to the standard deviation, and for that matter, only an approximation to the mean, too. We differentiate between population means/standard deviations and sample means/standard deviations by giving them different notations. The population mean is denoted μ (if you're not seeing the Greek letter mu, then get a better browser), while the sample mean is denoted x̄ ("x-bar").

The population standard deviation is denoted σ (sigma), while the sample standard deviation is denoted s. The population standard deviation is theoretical; the sample standard deviation is determined experimentally. Given n sample data values, x1, ..., xn, the associated sample standard deviation s is given by the formula:

s = sqrt( [ (x1 - x̄)² + (x2 - x̄)² + ... + (xn - x̄)² ] / (n - 1) )
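
Here is a minimal sketch in Java of how this formula might be coded (the class and method names are my own choices, not part of the text):

```java
// SampleStats.java -- a small sketch of the sample standard deviation formula.
public class SampleStats {

    // Sample mean: the sum of the data values divided by n.
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) sum += v;
        return sum / x.length;
    }

    // Sample standard deviation: square root of the sum of squared
    // deviations from the mean, divided by (n - 1).
    static double sampleStdDev(double[] x) {
        int n = x.length;
        if (n < 2) throw new IllegalArgumentException("need at least two data values");
        double xbar = mean(x);
        double sumSq = 0.0;
        for (double v : x) sumSq += (v - xbar) * (v - xbar);
        return Math.sqrt(sumSq / (n - 1));
    }

    public static void main(String[] args) {
        double[] data = {4.0, 7.0, 6.0, 5.0, 8.0};
        System.out.println(sampleStdDev(data));
    }
}
```

Running it on the five values 4, 7, 6, 5, 8 gives a mean of 6 and s = sqrt(10 / 4), which is about 1.58.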

Let's discuss this a bit. The sample standard deviation s is a measure of how spread out the data is, and the more spread out the data, the less the certainty associated with the data. That is, the probability that a new data value will fall within a given interval about the mean decreases as the standard deviation increases. Note that if n = 1, which means we have only one data value in our sample, then we should have no idea at all how spread out the population distribution would be. And sure enough, in that case, since we're dividing by (n - 1) = 0, the sample standard deviation is undefined. As n increases (more and more data values from which to make inferences), s becomes a more and more reliable estimate, approaching the population standard deviation σ as n goes to infinity.
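
You can check that last claim empirically. Below is a small experiment of my own (not from the text): it reuses the sampleStdDev method sketched above and java.util.Random's nextGaussian() to draw larger and larger samples from a Normal population with mean 10 and standard deviation 2. As n grows, s settles near 2.

```java
// Convergence.java -- draw bigger and bigger samples and watch s approach sigma.
import java.util.Random;

public class Convergence {
    public static void main(String[] args) {
        Random rng = new Random(42);           // fixed seed so the run is repeatable
        int[] sizes = {5, 50, 500, 50000};
        for (int n : sizes) {
            double[] sample = new double[n];
            for (int i = 0; i < n; i++) {
                // population: Normal with mean 10 and standard deviation 2
                sample[i] = 10.0 + 2.0 * rng.nextGaussian();
            }
            System.out.println("n = " + n + "  s = " + SampleStats.sampleStdDev(sample));
        }
    }
}
```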

For the most part we will be interested in probabilities associated with being out at the upper end of a probability distribution. On the next page is a little Java program associating probabilities with the right tails of Normal distributions (you'll see what this means when you get there).
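
To give a feel for what that sort of calculation involves, here is a rough sketch of my own (not the program on the next page) of a right-tail probability for a Normal distribution. It standardizes x and uses the Abramowitz & Stegun polynomial approximation to erf, since the standard Java library has no built-in Normal CDF.

```java
// RightTail.java -- a sketch of P(X > x) for a Normal distribution.
public class RightTail {

    // Abramowitz & Stegun formula 7.1.26 approximation to erf(x),
    // accurate to about 1.5e-7.
    static double erf(double x) {
        double sign = x < 0 ? -1.0 : 1.0;
        x = Math.abs(x);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
                - 0.284496736) * t + 0.254829592) * t;
        return sign * (1.0 - poly * Math.exp(-x * x));
    }

    // P(X > x) for a Normal distribution with the given mean and standard deviation.
    static double rightTail(double x, double mean, double sd) {
        double z = (x - mean) / sd;                     // standardize
        return 0.5 * (1.0 - erf(z / Math.sqrt(2.0)));   // 1 - CDF
    }

    public static void main(String[] args) {
        System.out.println(rightTail(2.0, 0.0, 1.0));
    }
}
```

For instance, rightTail(2.0, 0.0, 1.0) comes out to about 0.0228: roughly 2.3% of a standard Normal population lies more than two standard deviations above the mean.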