# Analysis of Variance

An analysis method for a quantitative outcome and two categorical explanatory variables.

If an experiment has a quantitative outcome and two categorical explanatory variables that are defined in such a way that each experimental unit (subject) can be exposed to any combination of one level of one explanatory variable and one level of the other explanatory variable, then the most common analysis method is two-way ANOVA. Because there are two different explanatory variables, the effect on the outcome of a change in one variable may either not depend on the level of the other variable (additive model) or depend on the level of the other variable (interaction model).

One common naming convention for a model incorporating a k-level categorical explanatory variable and an m-level categorical explanatory variable is “k by m ANOVA” or “k x m ANOVA”. ANOVA with more than two explanatory variables is often called multi-way ANOVA. If a quantitative explanatory variable is also included, that variable is usually called a covariate. In two-way ANOVA, the error model is the usual one of Normal distribution with equal variance for all subjects that share levels of both (all) of the explanatory variables. Again, we will call that common variance $\sigma^2$. And we assume independent errors.
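
As a rough illustration of this setup, the sketch below groups made-up observations from a hypothetical 2 x 3 design into cells, one cell per combination of levels, and computes each cell's sample mean along with a pooled within-cell variance as an estimate of the common variance $\sigma^2$. The data, the level names, and the function name `cell_summary` are all assumptions chosen for illustration; none of them come from the text.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (level of factor A, level of factor B, outcome).
# Factor A has 2 levels and factor B has 3 levels, so this is a "2 x 3" layout.
observations = [
    ("low", "x", 10.0), ("low", "x", 12.0),
    ("low", "y", 14.0), ("low", "y", 16.0),
    ("low", "z", 19.0), ("low", "z", 21.0),
    ("high", "x", 13.0), ("high", "x", 15.0),
    ("high", "y", 18.0), ("high", "y", 20.0),
    ("high", "z", 22.0), ("high", "z", 24.0),
]


def cell_summary(rows):
    """Return (cell means, pooled within-cell variance) for two-factor data."""
    # Collect the outcomes that share the same levels of both factors.
    cells = defaultdict(list)
    for a_level, b_level, outcome in rows:
        cells[(a_level, b_level)].append(outcome)

    cell_means = {cell: mean(values) for cell, values in cells.items()}

    # Under the equal-variance assumption, squared deviations from each
    # cell's own mean are pooled across cells to estimate sigma^2.
    ss_within = sum(
        (value - cell_means[cell]) ** 2
        for cell, values in cells.items()
        for value in values
    )
    # Each cell contributes (cell size - 1) degrees of freedom.
    df_within = sum(len(values) - 1 for values in cells.values())
    return cell_means, ss_within / df_within


cell_means, pooled_variance = cell_summary(observations)
for cell, cell_mean in sorted(cell_means.items()):
    print(cell, cell_mean)
print("pooled variance estimate:", pooled_variance)
```

The pooled estimate is only meaningful if at least one cell holds more than one observation, and it says nothing about whether the Normality or independence assumptions are reasonable for real data.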

CHAPTER 11. TWO-WAY ANOVA

Two-way (or multi-way) ANOVA is an appropriate analysis method for a study with a quantitative outcome and two (or more) categorical explanatory variables. The usual assumptions of Normality, equal variance, and independent errors apply.

The structural model for two-way ANOVA with interaction is that each combination of levels of the explanatory variables has its own population mean, with no restrictions on the pattern. One common notation writes $\mu_{ab}$ for the population mean of the outcome for subjects with level a of the first explanatory variable and level b of the second explanatory variable. The interaction model says that any pattern of $\mu$'s is possible, and a plot of those $\mu$'s could show any arbitrary pattern.

In contrast, the no-interaction (additive) model does have a restriction on the population means of the outcomes. For the no-interaction model we can think of the mean restrictions as saying that the effect on the outcome of any specific level change for one explanatory variable is the same for every fixed setting of the other explanatory variable. This is called an additive model.

Using the $\mu_{ab}$ notation, the mathematical form of the additive model is $\mu_{ac} - \mu_{bc} = \mu_{ad} - \mu_{bd}$ for any valid levels a, b, c, and d, where a and b are levels of the first explanatory variable and c and d are levels of the second. (Also, $\mu_{ab} - \mu_{ac} = \mu_{db} - \mu_{dc}$, where a and d are levels of the first explanatory variable and b and c are levels of the second.)

A more intuitive presentation of the additive model is a plot of the population means as shown in Figure 11.1. The same information is shown in both panels. In each the outcome is shown on the y-axis, the levels of one factor are shown on the x-axis, and separate colors are used for the second factor. The second panel reverses the roles of the factors from the first panel. Each point is a population mean of the outcome for a combination of one level from factor A and one level from factor B. The lines are shown as dashed because the explanatory variables are categorical, so interpolation “between” the levels of a factor makes no sense. The parallel nature of the dashed lines is what tells us that these means have a relationship that can be called additive. Also the choice of which factor is placed on the x-axis does not affect the interpretation, but commonly the factor with more levels is placed on the x-axis.
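
The additivity condition can also be checked numerically on a table of population means. The following is a minimal sketch, assuming the means are stored in a dictionary keyed by (level of factor A, level of factor B). The function name `is_additive`, the tolerance, and the two tables of means are hypothetical and chosen only to illustrate the equations.

```python
from itertools import combinations


def is_additive(means, tol=1e-9):
    """Check whether a complete table of cell means satisfies the additive model.

    `means` maps (a_level, b_level) to a population mean and must contain
    every combination of levels.
    """
    a_levels = sorted({a for a, _ in means})
    b_levels = sorted({b for _, b in means})
    for a1, a2 in combinations(a_levels, 2):
        # Effect of changing factor A from a2 to a1 at each level of factor B.
        effects = [means[(a1, b)] - means[(a2, b)] for b in b_levels]
        # Additivity requires this effect to be the same at every level of B.
        # Constant A-differences across B also imply constant B-differences
        # across A, so checking one direction is enough.
        if max(effects) - min(effects) > tol:
            return False
    return True


# Hypothetical means: moving from a1 to a2 adds 3 at every level of B,
# which corresponds to parallel lines in a plot of the means.
additive_means = {
    ("a1", "b1"): 4.0, ("a1", "b2"): 6.0, ("a1", "b3"): 9.0,
    ("a2", "b1"): 7.0, ("a2", "b2"): 9.0, ("a2", "b3"): 12.0,
}

# Changing a single cell makes the a1-to-a2 effect depend on the level of B.
interaction_means = dict(additive_means)
interaction_means[("a2", "b3")] = 15.0

print(is_additive(additive_means))
print(is_additive(interaction_means))
```

The first table passes the check and the second does not. Note that this applies to population means; with sample means, exact equality is not expected even when the additive model holds, so a check like this is not a substitute for a formal test of interaction.
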
Using this figure, you should now be able to understand the additive-model equations above. In either panel the change in outcome (vertical distance) is the same if we move between any two horizontal positions along any dashed line. Note that the concept of interaction vs. an additive model is the same for ANCOVA or a two-way ANOVA. In the additive model the effects of a change in
