1. INTRODUCTION

1.1 χ2 distribution and its properties

A chi-square (χ2) distribution is a family of density curves, each described by its degrees of freedom (df). The distribution has the following properties:

- Area under the curve = 1

- All χ2 values are non-negative. For df > 2 the curve begins at 0, increases to a peak and then decreases towards 0 as its asymptote; for df = 1 and df = 2 the curve has no peak and decreases steadily from its highest point near 0

- The curve is skewed to the right, and as the degrees of freedom increase, the distribution approaches a normal distribution

Fig. 1 Graph of χ2 distribution with differing degrees of freedom

Each χ2 value is computed by the formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E} \qquad \text{(Equation 1)}$$

where O = observed counts from the sample and E = expected counts based on the hypothesized distribution.
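As a minimal sketch of Equation 1, the sum can be written in a few lines of Python. The function name `chi_square_statistic` and the counts below are hypothetical and serve only as an illustration; they are not data from any study discussed here.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories (Equation 1)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for four categories (illustration only).
observed = [95, 28, 27, 10]
expected = [90, 30, 30, 10]

statistic = chi_square_statistic(observed, expected)
print(round(statistic, 4))
```

Each category contributes one term to the sum, so a category whose observed count matches its expected count exactly contributes nothing.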

1.2 Types of χ2 tests and their purpose

For a single population, the χ2 test for goodness-of-fit can be used to determine whether the observed distribution conforms to a specific known distribution or a previously studied distribution. For example, Mendel’s genetic model predicts that the phenotypes arising from two traits, each having a dominant and a recessive allele, will occur in the ratio 9:3:3:1. A study done to confirm this uses the χ2 test for goodness-of-fit to determine whether the observed population fits the theoretical model. We will discuss this example in detail in the next section.

To test whether the distribution of a categorical variable is the same across two or more populations, the χ2 test for homogeneity of populations can be used. The data in this case can be represented in a two-way table, with the different populations in the rows and the categories of the variable in the columns. For example, we could test whether the proportions of teachers with a PhD teaching a specific level in high school are the same across three different countries.

Data can also be represented in a two-way table when they are taken from a single population and classified according to two categorical variables, one in the rows and one in the columns. To test whether the two variables are related, the χ2 test for association/independence can be used.

For instance, to find out whether the choice of university (urban, suburban, rural) is associated with the place where the student lives (urban, suburban, rural), we could tabulate the data in a 3 by 3 table and carry out a chi-square test for association/independence.
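The 3 by 3 example can be sketched in Python as follows. The counts are hypothetical, and the sketch assumes the usual textbook conventions for two-way tables, which this section does not state: each expected count is (row total × column total) / grand total, and the degrees of freedom are (rows − 1) × (columns − 1).

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_two_way(table):
    """Return the chi-square statistic and degrees of freedom for a two-way table."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for observed_row, expected_row in zip(table, expected)
        for o, e in zip(observed_row, expected_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: rows = where the student lives, columns = university
# location, both ordered urban, suburban, rural (illustration only).
table = [
    [40, 20, 10],
    [20, 30, 10],
    [10, 10, 30],
]

expected = expected_counts(table)
statistic, df = chi_square_two_way(table)
print(df, round(statistic, 3))
```

The same calculation applies to the test for homogeneity; the two tests differ in how the data are collected, not in the arithmetic.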

2. TEST FOR GOODNESS OF FIT

Consider a single population in which each observation falls into one of n possible outcomes. The proportion of each outcome (the number of observations with that outcome divided by the total number of observations) is hypothesized to follow a certain predetermined distribution.

We test the null hypothesis

H0: The actual population proportions are equal to the hypothesized population proportions

The chi-square statistic is calculated using Equation 1,

$$X^2 = \sum \frac{(O - E)^2}{E}$$

where each expected count is calculated by multiplying the hypothesized proportion for that outcome by the total number of observations.
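To illustrate how expected counts follow from hypothesized proportions, the sketch below converts a ratio such as 9:3:3:1 into expected counts. The helper name `expected_from_ratio` and the sample size of 160 are assumptions chosen for illustration.

```python
def expected_from_ratio(ratio, total):
    """Turn a hypothesized ratio into expected counts for `total` observations."""
    ratio_sum = sum(ratio)
    # Each hypothesized proportion is part / ratio_sum; multiply it by the total.
    return [total * part / ratio_sum for part in ratio]


# Hypothetical sample of 160 observations under a 9:3:3:1 model.
expected = expected_from_ratio([9, 3, 3, 1], 160)
print(expected)  # [90.0, 30.0, 30.0, 10.0]
```

The expected counts always add up to the total number of observations, which is a quick check on the arithmetic.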

Conditions for using the χ2 test for goodness of fit:

1. All individual expected counts must be at least 1

2. No more than 20% of the expected counts are less than 5
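The two conditions can be checked mechanically before running the test. The following is a small sketch; the function name `conditions_met` is hypothetical.

```python
def conditions_met(expected):
    """Check the two expected-count conditions for the goodness-of-fit test."""
    # Condition 1: every expected count is at least 1.
    all_at_least_one = all(e >= 1 for e in expected)
    # Condition 2: no more than 20% of the expected counts are below 5.
    # Integer arithmetic (count * 5 <= n) avoids floating-point comparison issues.
    below_five = sum(1 for e in expected if e < 5)
    few_small_counts = below_five * 5 <= len(expected)
    return all_at_least_one and few_small_counts


print(conditions_met([90, 30, 30, 10]))   # True
print(conditions_met([0.5, 10, 10, 10]))  # False: one expected count is below 1
```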

The X2 (chi-square statistic) has approximately a χ2 distribution with (n – 1) degrees of freedom.

To test H0 against the alternative hypothesis

Ha: The actual population proportions are not equal to the hypothesized population proportions

the P-value is P(χ2 ≥ X2).
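For readers without a statistics package at hand, the upper-tail probability P(χ2 ≥ X2) can be sketched in plain Python. The function name `chi2_upper_tail` is hypothetical, and the sketch assumes integer degrees of freedom, for which the tail probability has a closed form built from the exponential function, the complementary error function and the gamma function.

```python
import math


def chi2_upper_tail(x, df):
    """Return P(chi-square with df degrees of freedom >= x) for integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    df = int(df)
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term = 1.0
        total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail, erfc(sqrt(x/2)), then add
    # exp(-x/2) * (x/2)^(i - 1/2) / Gamma(i + 1/2) for i = 1 .. (df - 1)/2.
    tail = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        tail += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return tail


# Hypothetical statistic with n = 4 outcomes, so df = n - 1 = 3.
p_value = chi2_upper_tail(0.7111, 3)
print(round(p_value, 3))
```

A large P-value, as in this hypothetical case, means the data give no evidence against H0; a small P-value means the observed counts are unlikely if the hypothesized proportions are correct.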

To evaluate the P-value using a graphing calculator, the input command under Home is as follows: TIStat.chi2Cdf(X2, ∞, df)

The command can also be found in Apps/Home/Catalog/F3 Flash Apps/chi2Cdf(

Value returned will be the...