Copyright 1983 by the American Psychological Association, Inc.
Statistical Significance, Power, and Effect Size: A Response to the Reexamination of Reviewer Bias

Bruce E. Wampold
Department of Educational Psychology, University of Utah

Michael J. Furlong and Donald R. Atkinson
Graduate School of Education, University of California, Santa Barbara

In responding to our study of the influence that statistical significance has on reviewers' recommendations for the acceptance or rejection of a manuscript for publication (Atkinson, Furlong, & Wampold, 1982), Fagley and McKinney (1983) argue that reviewers were justified in rejecting the bogus study when nonsignificant results were reported due to what Fagley and McKinney indicate is the low power of the bogus study. The concept of power is discussed in the present article to show that the bogus study actually had adequate power to detect a large experimental effect and that attempts to design studies sensitive to small experimental effects are typically impractical for counseling research when complex designs are used. In addition, it is argued that the power of the bogus study compares favorably to that of research published in the Journal of Counseling Psychology at the time our study was completed. Finally, the importance of considering statistical significance, power, and effect size in the evaluation of research findings is discussed.

Fagley and McKinney (1983) argue that we were not justified in concluding from our study of the manuscript reviewing process (Atkinson, Furlong, & Wampold, 1982) that a reviewer bias favoring statistically significant results was operating. The essence of their argument is that the power of the bogus study used as the manuscript stimulus in our design was, in their opinion, low (.79, .37, and .09 for large, medium, and small effects, respectively), and consequently reviewers were justified in rejecting it under the two statistically nonsignificant conditions included in our design.
Additional analyses, however, reveal that the bogus manuscript had adequate power by conventional standards, and thus we believe¹ our original conclusion that "a research manuscript reporting statistically significant findings is more likely to be recommended for publication than is a manuscript reporting statistically nonsignificant findings" (Atkinson et al., 1982, p. 192) is valid. Regardless of the validity of the conclusions derived from our original study, Fagley and McKinney have raised an important issue that merits further elaboration. In this article we will demonstrate that power is a useful index when examined in conjunction with other indices.

The authors wish to thank two anonymous reviewers for their useful suggestions. Requests for reprints should be sent to Bruce E. Wampold, Department of Educational Psychology, 327 Milton Bennion Hall, University of Utah, Salt Lake City, Utah 84112.

The Adequate Power of the Bogus Study

Initially, it must be noted that Fagley and McKinney used the term "low power" as if it had an omnibus interpretation, when in fact it is a relative term. The bogus study, for example, clearly had adequate power to detect a large effect. Thus the reviewers of the bogus study could have been quite confident that the nonsignificant result indicated that a large effect was not present in the population, a very informative conclusion. Fagley and McKinney are correct in pointing out that the power of the bogus study to detect a small experimental effect was low. However, the only practical manner in which the power of the bogus study could have been increased, and thereby met their concerns, would have been to increase the sample size. For virtually all counseling research that uses complex experimental designs, this is not a reasonable option. The bogus study was purposefully constructed to have a sample size that approximated the sample size of similar experiments...
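
The point that power is relative to effect size, and that sensitivity to small effects requires a much larger sample, can be illustrated with a minimal sketch. It is not a reconstruction of the bogus study's design, which is not described here. It assumes a simple two-group comparison, a two-sided normal (z) approximation, alpha = .05, and Cohen's conventional standardized mean differences (d = .2, .5, .8 for small, medium, and large effects). The function names `two_group_power` and `n_for_power` are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

# ASSUMPTION: two independent groups of equal size, two-sided z approximation.
# This is an illustrative simplification, not the design of the bogus study.
STD_NORMAL = NormalDist()


def two_group_power(d, n_per_group, alpha=0.05):
    """Approximate power to detect a standardized mean difference d."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = d * (n_per_group / 2) ** 0.5
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def n_for_power(d, target=0.80, alpha=0.05):
    """Smallest per-group sample size whose approximate power reaches target."""
    n = 2
    while two_group_power(d, n, alpha) < target:
        n += 1
    return n


if __name__ == "__main__":
    # Cohen's conventional effect sizes for a standardized mean difference.
    for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
        print(label, round(two_group_power(d, 20), 2), n_for_power(d))
```

Under these assumptions, holding the sample size fixed shows power rising sharply with effect size, while the per-group sample size needed to reach a power of .80 for a small effect is more than ten times that needed for a large effect. This is the sense in which "low power" has no meaning apart from the effect size one wishes to detect.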