The chi-square test
The chi-square test is used to determine whether differences between observed and expected values are the result of chance or the result of other factors. It can be applied to one or more variables, each with two or more categories. Pearson's chi-square test is the most common form. It requires independent observations and categories that are mutually exclusive and exhaustive. Furthermore, no expected value should be less than 1, and no more than 20% of expected values should be less than 5.
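The rule of thumb for expected values can be checked mechanically. The following is a minimal Python sketch; the function name expected_counts_ok and the example counts are illustrative assumptions, not part of any library.

```python
def expected_counts_ok(expected):
    """Check the rule of thumb for expected values:
    none below 1, and no more than 20% of them below 5."""
    # Flatten the table (a list of rows) into a single list of cells.
    cells = [value for row in expected for value in row]
    if any(value < 1 for value in cells):
        return False
    share_below_5 = sum(value < 5 for value in cells) / len(cells)
    return share_below_5 <= 0.20


# Hypothetical expected values for two 2 x 3 tables.
table_a = [[12.0, 8.5, 6.0], [10.0, 7.5, 4.2]]  # one of six cells is below 5
table_b = [[12.0, 0.8, 6.0], [10.0, 7.5, 9.0]]  # one cell is below 1

print(expected_counts_ok(table_a))
print(expected_counts_ok(table_b))
```

Under these assumptions the first table passes the check and the second fails it, because one of its expected values is below 1.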
The chi-square test can be used to test for:
Independence. That is, it tests the null hypothesis that two or more variables are independent (the alternative hypothesis is that the variables are dependent). For example, the null hypothesis might be that smoking and lung cancer are not related.
Equality of proportions. That is, it tests the null hypothesis that the distribution of a variable is the same across multiple independent populations (the alternative hypothesis is that the distribution differs between populations). For example, the null hypothesis might be that the distribution of lung cancer is the same across multiple ethnic groups.
Goodness of fit. That is, it tests the null hypothesis that the distribution of a variable within a population follows a hypothesised distribution (the alternative hypothesis is that it follows a different distribution).
To perform the test:
Construct one table of observed values and one table of expected values.
Calculate each cell's contribution to the chi-square statistic:
(observed value - expected value)^2 / expected value
Sum these values to give the chi-square statistic.
Optionally, calculate the Pearson's residual for each cell:
(observed value - expected value) / sqrt(expected value)
Calculate the degrees of freedom.
Use the chi-square statistic and the degrees of freedom to determine the p value, either with software or with a table of critical values. Reject the null hypothesis if the p value is less than the chosen significance level (conventionally 0.05).
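The steps above can be sketched in Python for a small goodness-of-fit test. The counts and hypothesised proportions are made up for illustration. The p value uses the closed form exp(-x / 2), which holds only for exactly 2 degrees of freedom; in general, software or a table of critical values is needed.

```python
import math

# Hypothetical one-way tables for a goodness-of-fit test with three categories.
observed = [50, 30, 20]
# 100 observations under hypothesised proportions of 0.4, 0.4 and 0.2.
expected = [40.0, 40.0, 20.0]

# Each cell's contribution to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Sum the contributions to give the chi-square statistic.
chi_square = sum(contributions)

# Three categories give 2 degrees of freedom.
degrees_of_freedom = 2

# Assumption: with exactly 2 degrees of freedom, the upper-tail
# probability of the chi-square distribution is exp(-x / 2).
p_value = math.exp(-chi_square / 2)

alpha = 0.05  # conventional significance level
print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}, p = {p_value:.4f}")
print("reject the null hypothesis" if p_value < alpha else "do not reject the null hypothesis")
```

With these numbers the statistic is 5.0 and the p value is roughly 0.082, so the null hypothesis is not rejected at the 0.05 level.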
Expected values?
When testing for goodness of fit, derive the expected values from the hypothesised distribution. When testing for independence or for equality of proportions, use:
(row total * column total) / grand total
In other words, under the null hypothesis the expected joint frequencies (cell values) are determined entirely by the marginals (row and column totals).
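As an illustration, the formula for expected values can be applied to a whole table at once. This is a minimal Python sketch; the function name expected_values and the counts are assumptions made for the example.

```python
def expected_values(observed):
    """Expected cell values under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]


# Hypothetical 2 x 2 table: rows are smokers / non-smokers,
# columns are lung cancer / no lung cancer.
observed = [[30, 70],
            [10, 90]]

expected = expected_values(observed)
for row in expected:
    print(row)
```

Here every row of the expected table is [20.0, 80.0]. The row and column totals of the expected table match those of the observed table, since only the marginals feed into the calculation.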
The Pearson's residual?
Where expected values are large, the raw differences between observed and expected values also tend to be large, which makes raw differences hard to compare across cells. The Pearson's residuals divide each difference by the square root of the expected value, so they provide more information than raw differences. The chisq.test function in R also calculates standardised residuals, using a slightly different formula.
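A short Python sketch may help show why the residuals are more informative. The function name pearson_residuals and both tables are hypothetical, and the code does not reproduce the standardised residuals that chisq.test reports.

```python
import math


def pearson_residuals(observed, expected):
    """(observed value - expected value) / sqrt(expected value), per cell."""
    return [[(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(observed, expected)]


# Hypothetical observed and expected tables; every raw difference is +10 or -10.
observed = [[30, 70],
            [10, 90]]
expected = [[20.0, 80.0],
            [20.0, 80.0]]

residuals = pearson_residuals(observed, expected)
for row in residuals:
    print([round(value, 3) for value in row])

# The squared residuals are the per-cell contributions,
# so they sum to the chi-square statistic.
chi_square = sum(value ** 2 for row in residuals for value in row)
print(round(chi_square, 3))
```

Although every raw difference has the same magnitude, the residuals in the first column (expected value 20) are twice as large as those in the second column (expected value 80), which shows where the departure from the expected values is concentrated.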
Degrees of freedom?
When testing for independence or for equality of proportions, use:
(rows - 1) * (columns - 1)
When testing for goodness of fit, where each column is one category of the variable, use:
columns - 1
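The two formulas translate directly into code. In this Python sketch the function names are illustrative assumptions, and the table sizes are arbitrary.

```python
def df_independence(rows, columns):
    """Degrees of freedom for independence or equality of proportions."""
    return (rows - 1) * (columns - 1)


def df_goodness_of_fit(columns):
    """Degrees of freedom for goodness of fit."""
    return columns - 1


print(df_independence(2, 2))   # 2 x 2 table
print(df_independence(3, 4))   # 3 x 4 table
print(df_goodness_of_fit(6))   # six categories, such as the faces of a die
```

These calls give 1, 6 and 5 degrees of freedom respectively.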
Agresti, A., 2007. An Introduction to Categorical Data Analysis. 2nd ed. Hoboken, NJ, USA: Wiley.
Boslaugh, S. and Watters, P.A., 2008. Statistics in a Nutshell. Sebastopol, CA, USA: O'Reilly.
R Core Team, 2012. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. [online] Available at: http://www.R-project.org/ [Accessed 16 November 2012].