Testing statistical significance is an excellent way to identify probable relevance between a total data set mean/sigma and a smaller sample data set mean/sigma, otherwise known as a population mean/sigma and a sample data set mean/sigma. This classification of testing is also very useful in showing probable relevance between data samples. Although testing statistical significance is not 100% foolproof, testing two data sets at the 95% probability level means there is no more than a 5% chance of seeing such a result when the two samplings differ only by chance. When testing at this level of probability, and with a data set that is big enough, a level of certainty can be created to help determine if further investigation is warranted. The following problem is used to illustrate how testing statistical significance paints a more descriptive picture of data set relationships.

Sam Sleep, a researcher, hypothesizes that people who are allowed to sleep for only four hours will score significantly lower on a management ability test than people who are allowed to sleep for eight hours. He brings sixteen participants into his sleep lab and randomly assigns them to one of two groups. In one group he has participants sleep for eight hours, and in the other group he has them sleep for four. The next morning he administers the SMAT (Sam's Management Ability Test) to all participants. (Scores on the SMAT range from 1 to 9, with high scores representing better performance.) Is Sam's hypothesis supported by this data?

SMAT scores
8 hours sleep group (X)
4 hours sleep group (Y)
When given a data set, one of the most important evaluations is to determine whether the data set is big enough to show relevance. So, the first thing I did was check whether the size warranted further review. Finding the smallest relevant sample size is as simple as taking the confidence quotient and multiplying it by the standard deviation to the second power, then dividing this product by .6 of the standard deviation. Another word for standard deviation is sigma, and from this point forward I will use S to represent a population's sigma and s to represent a sample set's sigma. In this situation, the first data set equation looks like:
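The equation itself is not reproduced in this text, so the following is only a minimal sketch of the rule as worded above, read literally as (confidence quotient × s²) ÷ (.6 × s). The grouping of terms, the function name `min_sample_size`, the value z = 1.96 for the confidence quotient, and the sigma of 2.0 are all assumptions made for illustration, not values taken from Sam's data.

```python
import math


def min_sample_size(z, s, fraction=0.6):
    """Literal reading of the rule in the text: (z * s**2) / (fraction * s).

    z        -- confidence quotient (assumed here to be a z value)
    s        -- sample standard deviation (sigma)
    fraction -- the ".6 of the standard deviation" factor from the text
    Returns the raw result and the result rounded up to a whole number.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    raw = (z * s ** 2) / (fraction * s)
    return raw, math.ceil(raw)


# Hypothetical inputs: 1.96 is the usual two-sided 95% z value,
# and s = 2.0 is a made-up sigma, not one of the SMAT groups.
raw, needed = min_sample_size(z=1.96, s=2.0)
print(f"raw = {raw:.2f}, rounded up = {needed}")
```

Under this reading the result grows with sigma, which matches the remark that a bigger sigma returns a bigger minimum size. The rounded-up value is the one to compare against the actual number of data points in the group.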
The second data set returned 8.37 because the sigma for the second data set was bigger than the first. Both of these numbers need to be rounded up to the nearest whole number and then compared to the sample size. The first sample set is equal to the recommended smallest sample size; however, the second sample set falls short by one data point. This test leads me to believe that the sample sizes are not big enough to stand up to significant scrutiny. Be that as it may, the data were put into a distribution chart to compare the distribution patterns; however, there was no significant difference.

The next step in finding whether there was a change between the samplings was to test the sigmas in an F test. This test takes the larger sigma squared and divides it by the smaller sigma squared to create F. The number of data points in each sample is then looked up in an F chart, which gives a range of values; if F falls within the range specified for that number of data points, the sigmas are not significantly different. This test does not show, at the 95% probability level, that the samplings are significantly different, and therefore it does not support Sam's theory.

The next statistical significance test is a t test. To be specific, the test used in this comparison is the t test of two sample averages. However, this equation gets a little complicated for words, so it is best to illustrate the computation. Before doing so, we need to establish some symbology for each of the numbers.

X̄1 = the mean of group X
X̄2 = the mean of group Y
n1 = the number of data points in group X
n2 = the number of data points in...
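
As a rough sketch of the two tests described above, the code below computes the F ratio (larger sigma squared over smaller sigma squared) and a t statistic for two sample averages. The t equation is not shown in this text, so the pooled-variance form is an assumption. The function names are hypothetical, and the scores are made-up values on the 1 to 9 SMAT scale, not Sam's data. The F and t values would still need to be compared against an F chart and a t chart.

```python
import math
import statistics


def f_ratio(x, y):
    """Larger sample variance divided by the smaller sample variance."""
    var_x = statistics.variance(x)  # sample variance, n - 1 denominator
    var_y = statistics.variance(y)
    if min(var_x, var_y) == 0:
        raise ValueError("both samples need non-zero variance")
    return max(var_x, var_y) / min(var_x, var_y)


def pooled_t(x, y):
    """t statistic for two sample means (pooled-variance form, an assumption).

    Returns the t value and the degrees of freedom, n1 + n2 - 2.
    """
    n1, n2 = len(x), len(y)
    mean1, mean2 = statistics.mean(x), statistics.mean(y)
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, n1 + n2 - 2


# Hypothetical scores for illustration only (eight per group).
group_x = [7, 5, 6, 8, 6, 7, 5, 8]  # stands in for the 8 hours sleep group
group_y = [4, 6, 3, 5, 4, 6, 2, 6]  # stands in for the 4 hours sleep group

f_value = f_ratio(group_x, group_y)
t_value, dof = pooled_t(group_x, group_y)
print(f"F = {f_value:.2f}, t = {t_value:.2f}, degrees of freedom = {dof}")
```

Pooling the two variances is the usual choice when the F test finds no significant difference between the sigmas. A version that does not assume equal variances would use a different standard error and degrees of freedom.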