# ANOVA

**Topics:** Analysis of variance, Statistical hypothesis testing, Multiple comparisons

**Pages:** 17 (1286 words)

**Published:** December 19, 2013

Indian Institute of Public Health Delhi

MSc CR 2013-15

Outline of the session

• Need for Analysis of Variance

• Concept behind one way ANOVA

• Example

• Non-parametric alternative

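When dependent variable is continuous

| Type of dependent variable | Type of independent variable | Number of groups | Related/Dependent | Distribution of dependent variable | Choice of significance test |
|---|---|---|---|---|---|
| Continuous | Categorical | One | NA | Normal | z or one sample t-test |
| Continuous | Categorical | One | NA | Not normal | Non-parametric (Wilcoxon sign rank) |
| Continuous | Categorical | Two | Related | Normal | Paired t-test |
| Continuous | Categorical | Two | Related | Not normal | Non-parametric (Wilcoxon sign rank) |
| Continuous | Categorical | Two | Unrelated or independent | Normal | Independent z or t-test |
| Continuous | Categorical | Two | Unrelated or independent | Not normal | Non-parametric (Wilcoxon rank sum or Mann-Whitney U) |
| Continuous | Categorical | More than two | Unrelated | Normal | One way ANOVA/linear regression |
| Continuous | Categorical | More than two | Unrelated | Not normal | Non-parametric (Kruskal Wallis) |
| Continuous | Categorical | More than two | Related | Normal | Repeated ANOVA |
| Continuous | Categorical | More than two | Related | Not normal | Non-parametric (Friedman's test) |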

Background

• When there are more than two groups to compare, a t-test could be applied repeatedly, once for each pair of groups

• This is not done because the probability of a type I error increases

• That probability grows as the number of comparisons increases

• Analysis of variance (ANOVA) is one way of dealing with this problem: it tests for overall significance
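The growth of the type I error with repeated t-tests can be illustrated with a minimal sketch. It assumes a significance level of 0.05 per test and, as a simplification that is not stated in the slides, treats the pairwise comparisons as independent, so the chance of at least one false positive among k tests is 1 - (1 - α)^k. The function names are illustrative.

```python
from math import comb


def pairwise_comparisons(groups: int) -> int:
    """Number of distinct pairs of groups, i.e. t-tests needed."""
    return comb(groups, 2)


def familywise_error(alpha: float, comparisons: int) -> float:
    """Probability of at least one type I error across independent tests."""
    return 1 - (1 - alpha) ** comparisons


# Assumed per-test significance level of 0.05.
for g in (3, 4, 5, 10):
    k = pairwise_comparisons(g)
    rate = familywise_error(0.05, k)
    print(f"{g} groups: {k} t-tests, familywise error about {rate:.3f}")
```

Real pairwise comparisons share data and are not independent, so the figures are only an approximation of how quickly the error rate inflates.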

One way ANOVA

• Used to compare the mean of a numerical outcome variable across the groups defined by an exposure variable with two or more categories

• The method is based on how much of the overall variation in the outcome variable is attributable to differences between the exposure group means

• With only two groups, it is equivalent to the two-sample t-test with equal variances
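The link to the two-sample t-test can be checked numerically: with two groups, the one-way ANOVA F statistic equals the square of the equal-variance t statistic. The sketch below uses only the standard library; the data values and function names are hypothetical and chosen purely for illustration.

```python
from statistics import mean, variance

# Hypothetical measurements for two exposure groups (illustrative only).
group_a = [5.1, 4.9, 6.0, 5.6, 5.4]
group_b = [6.2, 6.8, 5.9, 7.1, 6.5]


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    n = sum(len(g) for g in groups)
    p = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (p - 1)) / (ss_within / (n - p))


t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(f"t squared = {t ** 2:.4f}, F = {f:.4f}")
```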

One-Way ANOVA Partitions Total Variation

Total variation is split into:

• Variation due to treatment (among-groups variation)

• Variation due to random sampling (within-groups variation)

One-Way ANOVA

• The difference between the means could be due to variability between the groups and variability within the groups

• Total variation = between-group variation + within-group variation

• ANOVA partitions the total sum of squares into two parts

– Sum of squares due to differences between the group means

– Sum of squares due to differences between the observations within each group, called residuals

Test Statistic

• Each sum of squares is divided by its degrees of freedom to give a mean square

• The mean squares are then compared using the F test

• Hence, the test statistic is given by

$$F = \frac{\text{between-groups mean square}}{\text{within-groups mean square}}$$

• With df = (df between groups, df within groups) = (p - 1, n - p)

• where n is the total number of observations and p is the number of groups

Total Variation

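$$SS_{\text{Total}} = (X_{11} - \bar{X})^2 + (X_{21} - \bar{X})^2 + \cdots + (X_{ij} - \bar{X})^2$$

[Figure: Response, X for Group 1, Group 2 and Group 3, with the overall mean $\bar{X}$ marked]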

Treatment Variation

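$$SST = n_1(\bar{X}_1 - \bar{X})^2 + n_2(\bar{X}_2 - \bar{X})^2 + \cdots + n_p(\bar{X}_p - \bar{X})^2$$

[Figure: Response, X for Group 1, Group 2 and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the overall mean $\bar{X}$ marked]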

Random (Error) Variation

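$$SSE = (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \cdots + (X_{ip} - \bar{X}_p)^2$$

Here $X_{ij}$ is the i-th observation in group j, and each observation is compared with the mean of its own group.

[Figure: Response, X for Group 1, Group 2 and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]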
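The three sums of squares can be computed directly to confirm that the total variation is the sum of the treatment and error variation. This is a minimal sketch on hypothetical data for three groups; the values and variable names are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical responses for three groups (illustrative only).
groups = {
    "Group 1": [4.0, 5.0, 6.0],
    "Group 2": [7.0, 8.0, 9.0],
    "Group 3": [10.0, 12.0, 14.0],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Total variation: every observation against the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Treatment variation: each group mean against the overall mean,
# weighted by the group size.
sst = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

# Random (error) variation: every observation against its own group mean.
sse = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

print(f"SS(Total) = {ss_total:.3f}")
print(f"SST + SSE = {sst + sse:.3f}")
```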

One-Way ANOVA: F-Test Statistic

• Test Statistic

– $F = \dfrac{MST}{MSE} = \dfrac{SST/(p - 1)}{SSE/(n - p)}$

• MST is Mean Square for Treatment

• MSE is Mean Square for Error

• Degrees of Freedom

– $\nu_1 = p - 1$

– $\nu_2 = n - p$

• p = # Populations, Groups, or Levels

• n = Total Sample Size
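As a small worked sketch of the formula, the function below turns the two sums of squares into the F statistic and its degrees of freedom. The function name and the input numbers are hypothetical.

```python
def f_statistic(sst: float, sse: float, n: int, p: int) -> tuple[float, int, int]:
    """Return (F, df1, df2) for a one-way ANOVA with p groups and n observations."""
    if p < 2 or n <= p:
        raise ValueError("need at least two groups and more observations than groups")
    df1, df2 = p - 1, n - p
    mst = sst / df1  # mean square for treatment
    mse = sse / df2  # mean square for error
    return mst / mse, df1, df2


# Hypothetical sums of squares for 9 observations in 3 groups.
f, df1, df2 = f_statistic(sst=74.0, sse=12.0, n=9, p=3)
print(f"F = {f:.2f} with df = ({df1}, {df2})")
```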

One-Way ANOVA Summary Table

| Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square (Variance) | F |
|---|---|---|---|---|
| Treatment | p - 1 | SST | MST = SST/(p - 1) | MST/MSE |
| Error | n - p | SSE | MSE = SSE/(n - p) | |
| Total | n - 1 | SS(Total) = SST + SSE | | |
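The summary table can be filled in from raw data with a short script. This is a sketch using only the standard library, so it stops at the F statistic; obtaining a p-value would additionally need an F-distribution function such as `scipy.stats.f.sf`. The data and helper names are hypothetical.

```python
from statistics import mean


def anova_table(groups):
    """Return one-way ANOVA summary rows as (source, df, SS, MS, F) tuples."""
    n = sum(len(g) for g in groups)
    p = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    mst, mse = sst / (p - 1), sse / (n - p)
    return [
        ("Treatment", p - 1, sst, mst, mst / mse),
        ("Error", n - p, sse, mse, None),
        ("Total", n - 1, sst + sse, None, None),
    ]


def fmt(value):
    """Format a number to three decimals, or leave the cell empty."""
    return "" if value is None else f"{value:.3f}"


# Hypothetical outcome values for three groups of four subjects each.
data = [[23, 25, 21, 24], [30, 28, 31, 29], [22, 20, 24, 26]]
rows = anova_table(data)

print(f"{'Source':<10}{'df':>4}{'SS':>10}{'MS':>10}{'F':>10}")
for source, df, ss, ms, f in rows:
    print(f"{source:<10}{df:>4}{fmt(ss):>10}{fmt(ms):>10}{fmt(f):>10}")
```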

F Distribution

Assumptions for ANOVA

• Independent random samples are drawn

• The outcome variable should follow a normal distribution: each group should be approximately normal

• The standard deviations of the groups are approximately the same
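A first, informal look at the equal-spread assumption is to compare the sample standard deviations of the groups side by side. The sketch below does only that, on hypothetical data; the slides give no cutoff for what counts as "approximately the same", so the ratio is printed for inspection rather than tested against a threshold.

```python
from statistics import stdev

# Hypothetical responses for three groups (illustrative only).
groups = {
    "Group 1": [4.0, 5.0, 6.0],
    "Group 2": [7.0, 8.0, 9.0],
    "Group 3": [10.0, 12.0, 14.0],
}

# Sample standard deviation of each group.
sds = {name: stdev(values) for name, values in groups.items()}
for name, sd in sds.items():
    print(f"{name}: SD = {sd:.3f}")

# Ratio of the largest to the smallest SD, for eyeballing only.
ratio = max(sds.values()) / min(sds.values())
print(f"largest / smallest SD = {ratio:.2f}")
```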

Hypothesis test

• ANOVA tests following...
