Question 1 Section A 1a
It may not be possible or practical to analyse an entire population. Instead, a sample of the population may be used to predict or infer something about the population. Inferences may be point estimates, which estimate a single parameter, or interval estimates, which give a range of values likely to contain the parameter, known as confidence intervals. The width of the confidence interval gives us some idea of how uncertain we are about the unknown parameter: for a given confidence level, the narrower the interval, the more precise the estimate.
1b The central limit theorem states that ‘whatever the distribution of the population, the mean of a large sample has an (approximate) normal distribution’ (Swift and Pitt p 498). The calculation of the confidence interval is therefore based on the standardised normal distribution with mean 0 and variance 1. Each sample mean is a single observation of a random variable designated x̄. The mean of the sample means is an unbiased estimator of the actual population mean μ, and the standard deviation of the sample means (the standard error) is σ/√n, where σ is the population standard deviation and n is the sample size.
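The behaviour described by the central limit theorem can be illustrated with a small simulation. This is only a sketch under assumed values: the exponential population (mean 1, standard deviation 1), the sample size of 50, the 4,000 trials and the function name sample_means are all illustrative choices and are not taken from the question.

```python
import random
import statistics


def sample_means(n, trials, seed=0):
    """Return the mean of each of `trials` samples of size n.

    Assumption: the population is exponential with mean 1 and sd 1,
    chosen only because it is clearly not normal.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]


n = 50
means = sample_means(n, trials=4000)

mean_of_means = statistics.fmean(means)  # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)    # should be close to sigma / sqrt(n)
expected_sd = 1.0 / n ** 0.5             # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {expected_sd:.3f})")
```

Even though the assumed population is strongly skewed, the sample means cluster around the population mean with a spread close to σ/√n.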
The 99% confidence interval is formed by the range 2.58 standard deviations either side of the mean. The stated confidence interval, 82,636 to 87,364, has a width of 4,728, which is equivalent to 2.58 x 2 = 5.16 standard deviations, where one standard deviation of the sample mean is σ/√n.
It is known that the population variance is 56,250,000, and therefore the population standard deviation σ is the square root of this number, i.e. 7,500. Therefore, 4,728 = 5.16 x 7,500/√n. Solving for the unknown: √n = (5.16 x 7,500)/4,728 = 8.185.
Therefore n = 8.185² ≈ 67, and the estimate of the population mean is the midpoint of the interval, 82,636 + (4,728/2) = 85,000.
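The arithmetic for part 1b can be checked with a short Python sketch. The figures are those given in the answer; the variable names are illustrative, and the multiplier 2.58 is the rounded value used in the text (the exact 99% two-sided value is about 2.576).

```python
import math

variance = 56_250_000
sigma = math.sqrt(variance)      # population standard deviation, 7,500
lower, upper = 82_636, 87_364    # stated 99% confidence interval
z_99 = 2.58                      # rounded multiplier used in the text

half_width = (upper - lower) / 2     # 2,364, i.e. 2.58 standard errors
root_n = z_99 * sigma / half_width   # square root of the sample size
n = round(root_n ** 2)               # sample size, rounded to a whole number
midpoint = (lower + upper) / 2       # centre of the interval

print(f"sqrt(n) = {root_n:.3f}, n = {n}, midpoint = {midpoint:,.0f}")
```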
1c (1) Hypothesis testing is used to assess whether sample data are consistent with a specified hypothesis. The assumption in this case, that the mean salary of the population is 85,000, is called the null hypothesis (H0). The alternative hypothesis (H1) is the claim to be tested, which is that it is not. Although the mean salary from the sample is taken as 87,375, and so is greater than the assumed mean of the population, we would be equally interested to know whether the actual mean salary is more or less than the mean estimated from the confidence interval. Therefore the null and alternative hypotheses may be written:
H0 : μ = 85,000
H1 : μ ≠ 85,000
(2) As we are interested in discovering whether the mean salary is greater or less than the estimate, this gives rise to a two-tail test. Two-tailed testing is used when the region of rejection is on both sides of the sampling distribution. One-tailed testing is used when testing for a difference in one direction only, either greater than (right-hand test) or less than (left-hand test).
(3) Once the null and alternative hypotheses have been defined, a level of significance at which the null hypothesis will be rejected is chosen. At a significance level of 5% (a confidence level of 95%), the null hypothesis will be rejected only if, assuming the null hypothesis is true, the probability of observing a sample result at least as extreme as the one obtained is 5% (100 – 95) or less. The measured sample statistic, in this case the mean salary of the sample, which is 87,375, is compared with the estimated population parameter of 85,000, and a z-score, known as a test statistic, is calculated. This gives the number of standard deviations of the sample mean represented by the gap between the two figures. The calculation of the z-score is as follows: z = (x̄ – μ) / (σ/√n). Thus, the calculation is: (87,375 – 85,000) / (7,500/8.185) = 2.59
(4) Based on this calculation, the observed sample mean is 2.59 standard deviations away from the estimated population mean. At the 95% confidence level, the critical z-score for a two-tail test is 1.96. The calculated z-score exceeds the critical value, so there is sufficient evidence at this level to conclude that the population mean salary is not 85,000. Therefore the decision would be made to reject the null hypothesis.
1d If the null hypothesis has been rejected at the 95%...
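The two-tail z-test in part 1c can be reproduced with a minimal Python sketch using only the standard library. The inputs are the figures from the answer (sample mean 87,375, hypothesised mean 85,000, σ = 7,500, n = 67); the variable names are illustrative, and the p-value is an extra check that the original answer does not compute.

```python
import math
from statistics import NormalDist

mu_0 = 85_000    # mean under the null hypothesis H0
x_bar = 87_375   # observed sample mean
sigma = 7_500    # known population standard deviation
n = 67           # sample size found in part 1b
alpha = 0.05     # 5% significance level (95% confidence)

std_error = sigma / math.sqrt(n)   # standard deviation of the sample mean
z = (x_bar - mu_0) / std_error     # test statistic

standard_normal = NormalDist()                       # mean 0, sd 1
z_crit = standard_normal.inv_cdf(1 - alpha / 2)      # two-tail critical value
p_value = 2 * (1 - standard_normal.cdf(abs(z)))      # two-tail p-value

reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```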