Consider the following question: someone takes a sample from a population and finds both the sample mean and the sample standard deviation. What can he learn from this sample mean about the population mean? This is an important problem, and it is addressed by the Central Limit Theorem. For now, let us not worry about the precise statement of this theorem but look at how it helps us answer our question.
The Central Limit Theorem tells us that if we take very many samples of the same size, the means of these samples will be approximately normally distributed around the population mean, provided the samples are large enough. Some sample means will be larger than the population mean, some will be smaller. It follows that 95% of the sample means will lie in a certain interval around the population mean. That interval is called the 95% confidence interval. In practical terms, whenever someone takes a sample and calculates the mean of that sample, he can be 95% confident that the mean of the sample he just took is in the 95% confidence interval. More importantly, the argument can be turned around: the sample mean is within a certain distance of the population mean exactly when the population mean is within that same distance of the sample mean. An interval of the same width centered on the sample mean will therefore contain the population mean for 95% of all samples, and in this sense he can be 95% confident that the population mean lies in it. Thus, the sample mean gives us an approximation of the population mean. The same holds true for a 90%, a 99%, or for that matter any percentage confidence interval. Depending on the situation we are in, we can easily calculate these intervals. There are three different situations which we will study, but let us first look at the general idea of a confidence interval.
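The claim that 95% of the sample means fall in an interval around the population mean can be checked with a small simulation. The following Python sketch is only an illustration under stated assumptions: the population values (mean 50, standard deviation 10), the sample size of 40, the number of trials, and the function name are all made up for the example. It also uses two standard facts not derived in the text so far: the spread of the sample means is the population standard deviation divided by the square root of the sample size, and the 95% interval reaches 1.96 of these units on either side of the population mean (the value 1.96 is read from the table later in this section).

```python
import random
import statistics

def coverage_of_sample_means(mu=50.0, sigma=10.0, n=40, trials=5000, z=1.96, seed=1):
    """Return the fraction of simulated sample means within z standard errors of mu.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    standard_error = sigma / n ** 0.5    # spread of the sample means
    inside = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and take its mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - mu) <= z * standard_error:
            inside += 1
    return inside / trials

fraction = coverage_of_sample_means()
print(f"Fraction of sample means inside the interval: {fraction:.3f}")
```

With several thousand trials the printed fraction should come out close to 0.95, although the exact figure depends on the seed.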
The General Idea of Confidence Intervals
Suppose that we have a population which is normally distributed. The population mean, usually denoted by μ, will thus be at the peak of the distribution. Assume that we plot the sample means on the horizontal axis. The 95% confidence interval is the interval that contains 95% of the sample means. Since the normal distribution is perfectly symmetrical around the population mean, the confidence interval will be centered on the population mean. Thus we also know that, within the confidence interval, 47.5% of all sample means are to the left of the population mean and 47.5% of them are to the right. Half of all sample means lie to the left of the population mean, so the remaining 50% - 47.5% = 2.5% of all sample means are in the left tail of the diagram below. Because the normal distribution is perfectly symmetrical, an equal 2.5% of all the sample means are to be found in the right tail. This is illustrated in the following diagram.
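The tail-area arithmetic in this paragraph works the same way for any confidence level. As a minimal sketch (the function name tail_areas is hypothetical), the Python lines below split a confidence level into the share on each side of the population mean and the share in each tail, and then use NormalDist from the standard library to confirm the symmetry the argument relies on.

```python
from statistics import NormalDist

def tail_areas(confidence):
    """Split a confidence level (as a decimal) into the area on each side
    of the mean inside the interval and the area in each tail."""
    half_inside = confidence / 2
    each_tail = (1 - confidence) / 2
    return half_inside, each_tail

for level in (0.90, 0.95, 0.99):
    half_inside, each_tail = tail_areas(level)
    print(f"{level:.0%}: {half_inside:.1%} on each side, {each_tail:.1%} in each tail")

# Symmetry check on the standard normal distribution: the area to the left
# of -1 equals the area to the right of +1.
std_normal = NormalDist()
left = std_normal.cdf(-1.0)
right = 1 - std_normal.cdf(1.0)
print(abs(left - right) < 1e-12)
```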
The question now is how to find, from the Cumulative Standard Normal Distribution Table, the z-value that cuts off the 2.5% in the left tail. This Cumulative Standard Normal Distribution Table is given, in two parts, below. The first part covers z-values from -3.40 to -0.09, the second part deals with z-values from 0.00 to 3.49. [pic]
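Where the printed table is not at hand, one of its rows can be reproduced with Python's standard library. The sketch below assumes the usual layout of such a table: the far-left column gives z to one decimal place, the ten column headings .00 through .09 supply the second decimal, and each entry is the area to the left of z rounded to four decimals. The helper name table_row is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def table_row(row_label):
    """Cumulative probabilities for one row of the negative-z part of the table.

    row_label is the value in the far-left z-column (for example -1.9);
    the ten columns add the second decimal digit, .00 through .09.
    """
    return [round(std_normal.cdf(row_label - col / 100), 4) for col in range(10)]

row = table_row(-1.9)
for col, prob in enumerate(row):
    print(f"z = {-1.9 - col / 100:.2f}   area to the left = {prob:.4f}")

# Inverse lookup: the z-value that leaves an area of .025 in the left tail.
z_left = std_normal.inv_cdf(0.025)
print(round(z_left, 2))
```

The inverse lookup with inv_cdf does in one step what reading the table does by hand: it starts from the tail area and returns the matching z-value.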
This table does not deal with percentages but with decimals, so the 2.5% appears as .0250. Please note that the tip of the arrow with 95% at its bottom points to .0250. Mark the tip of this arrow and then go to the left in that same row to read the first part of the z-value (-1.9) from the far-left z-column. The second digit after the decimal point is found by going from the tip of the arrow up to the top row, where the column heading is .06. In this way you get a z-value of -1.96. For a 90% confidence interval we again consult the table for a probability value of .05 (45% on each side of the population mean, so 5% in the left tail). We cannot find .0500 exactly, only the neighboring entries .0505 (z = -1.64) and .0495 (z = -1.65), so we follow the common convention and take zc = -1.65. A second arrow, with 90% at its bottom, points to this value. For the 99% confidence interval, we look up a probability of .005 (49.5%, which is half of 99%, on each side of the population mean leaves 0.5% in the left tail) and find zc = -2.58. A third arrow with 99%...