· The Chi-square test for independence is a non-parametric (distribution-free) method for testing whether two categorical (nominal) variables, arranged in a contingency table, are associated.
· For example, if we have two treatment groups (treated and non-treated) and two treatment outcomes (cured and non-cured), we can apply the chi-square test for independence to see whether treatment is associated with outcome.
· Because the Chi-square test is based on an approximation (it returns an approximate p value), it requires a sufficiently large sample size. No more than 20% of cells should have an expected frequency count below 5. If the sample size is small, the chi-square approximation is unreliable, and Fisher's exact test should be used instead.
· The chi-square independence test is not the same as the chi-square goodness of fit test.
Formula
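For reference, the standard Pearson chi-square statistic can be written as below. O_i and E_i are the observed and expected counts in cell i, matching the notation used in the hypotheses. The symbols R_i (row total), C_i (column total), N (grand total), r (number of rows) and c (number of columns) are notation introduced here for illustration.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i},
\qquad
E_i = \frac{R_i \, C_i}{N},
\qquad
\mathrm{df} = (r - 1)(c - 1)
```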
Hypotheses for Chi-square test for independence
· Null hypothesis (H0): The two categorical variables are independent (there is no association between the two variables) (H0: Oi = Ei)
· Alternative hypothesis (Ha): The two categorical variables are dependent (there is an association between the two variables) (Ha: Oi ≠ Ei)
· The p value is not described as one-tailed or two-tailed. The rejection region of the chi-square test is always in the right tail of the chi-square distribution.
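The hypotheses above can be tested with a short computation. The following is a minimal sketch using only the Python standard library, not a definitive implementation. The counts in `table` are hypothetical, and the function names `expected_counts`, `chi2_sf` and `chi2_independence` are made up for illustration. The p value is the right-tail probability of the chi-square distribution. No continuity correction is applied here, so results may differ from libraries that apply one to 2x2 tables by default.

```python
import math


def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi2_sf(x, df):
    """Right-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(x; v + 2) = Q(x; v) + (x/2)^(v/2) * exp(-x/2) / Gamma(v/2 + 1).
    if df % 2 == 1:
        q, v = math.erfc(math.sqrt(x / 2)), 1
    else:
        q, v = math.exp(-x / 2), 2
    while v < df:
        q += (x / 2) ** (v / 2) * math.exp(-x / 2) / math.gamma(v / 2 + 1)
        v += 2
    return q


def chi2_independence(observed):
    """Return (statistic, degrees of freedom, p value) for a contingency table."""
    expected = expected_counts(observed)
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts: rows = treated / non-treated, columns = cured / non-cured
table = [[30, 10], [15, 25]]
stat, df, p_value = chi2_independence(table)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.5f}")
```

If the p value falls below the chosen significance level (for example 0.05), the null hypothesis of independence is rejected.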
Chi-square test assumptions
· Data is randomly sampled and the two variables are categorical (nominal).
· The levels of the variables are mutually exclusive.
· The expected frequency count is at least 5 in at least 80% of the cells of the contingency table. For small frequency counts, Fisher's exact test is appropriate.
· The expected frequency count in every cell must be at least 1.
· Observations should be independent of one another. The data should be frequency counts, not percentages, proportions or transformed data.
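The expected-count assumptions can be checked before running the test. The sketch below assumes the table is given as a list of rows of raw counts. The helper names `expected_counts` and `check_expected_counts` and both example tables are hypothetical.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def check_expected_counts(observed):
    """Check the rules: every expected count >= 1, and at least 80% of cells >= 5."""
    expected = [e for row in expected_counts(observed) for e in row]
    share_at_least_5 = sum(e >= 5 for e in expected) / len(expected)
    return {
        "min_expected": min(expected),
        "share_at_least_5": share_at_least_5,
        "ok": min(expected) >= 1 and share_at_least_5 >= 0.8,
    }


small = [[3, 2], [1, 4]]      # hypothetical small-sample table
large = [[30, 10], [15, 25]]  # hypothetical larger table

print(check_expected_counts(small))
print(check_expected_counts(large))
```

When `ok` is false, as for the small table here, Fisher's exact test is the safer choice.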