# Chi Square Distribution - Explained


# What is a Chi-Square (χ²) Distribution?

In probability theory and statistics, the chi-squared distribution (also called the chi-square or χ²-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables. The chi-squared distribution is a special case of the gamma distribution and is one of the most widely used probability distributions in inferential statistics. It is commonly used in hypothesis testing and in the construction of confidence intervals.
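The definition can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function names `sample_chi_square` and `simulate` are illustrative, and the comparison relies on the standard result that a chi-squared variable with k degrees of freedom has mean k and variance 2k.

```python
import random
import statistics


def sample_chi_square(k, rng):
    """Draw one value: the sum of squares of k independent standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def simulate(k, n_draws=20000, seed=0):
    """Return the sample mean and sample variance of n_draws simulated values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    draws = [sample_chi_square(k, rng) for _ in range(n_draws)]
    return statistics.fmean(draws), statistics.variance(draws)


k = 3
mean, var = simulate(k)
print(f"k={k}: sample mean {mean:.2f} (theory {k}), "
      f"sample variance {var:.2f} (theory {2 * k})")
```

With enough draws, the sample mean and variance should land close to k and 2k respectively, which is a quick sanity check rather than a proof.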


## How is the Chi-Square (χ²) Distribution Used?

The chi-squared distribution is used in the common chi-squared tests for the goodness of fit of an observed distribution to a theoretical one. It is also used to test the independence of two criteria for classifying qualitative data, and to estimate a confidence interval for the population standard deviation of a normal distribution from a sample standard deviation. Other statistical procedures, such as Friedman's analysis of variance by ranks, also use the chi-squared distribution. Its most common use is in hypothesis testing. Unlike more widely known distributions such as the normal and exponential distributions, the chi-squared distribution is rarely used to model natural phenomena directly. It arises in the following hypothesis tests, among others:

• Chi-squared test of independence in contingency tables
• Chi-squared test of goodness of fit of observed data to hypothetical distributions
• Likelihood-ratio test for nested models
• Log-rank test in survival analysis
• Cochran-Mantel-Haenszel test for stratified contingency tables
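As an illustration of the first test in the list, here is a minimal sketch of Pearson's chi-squared test of independence for a 2x2 contingency table, using only the Python standard library. The counts are hypothetical, the function name `chi_square_independence_2x2` is illustrative, and no continuity correction is applied. Because a 2x2 table has one degree of freedom, the p-value can be computed from the standard normal tail: a chi-squared variable with one degree of freedom is the square of a standard normal variable.

```python
import math


def chi_square_independence_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column classifications are independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # One degree of freedom: P(Z^2 > x) = erfc(sqrt(x / 2)) for standard normal Z.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


table = [[30, 10], [20, 40]]  # hypothetical observed counts
stat, p_value = chi_square_independence_2x2(table)
print(f"chi-squared statistic = {stat:.3f}, p-value = {p_value:.2e}")
```

A small p-value suggests that the two classifications are not independent. For larger tables the degrees of freedom exceed one and the tail probability needs the general chi-squared distribution function, which this sketch does not implement.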

Besides the above applications, the chi-squared distribution enters the definitions of the t-distribution and the F-distribution, which are used in t-tests, analysis of variance, and regression analysis.

The main reason for the extensive use of the chi-squared distribution in hypothesis testing is its relationship to the normal distribution. Many hypothesis tests use a test statistic, for example the t-statistic in a t-test. For these tests, as the sample size n increases, the sampling distribution of the test statistic approaches the normal distribution (the central limit theorem). Because the test statistic is asymptotically normally distributed, the distribution used for hypothesis testing can be approximated by a normal distribution, provided the sample size is large enough. Testing hypotheses with a normal distribution is well understood and relatively easy. The simplest chi-squared distribution is the distribution of the square of a standard normal variable, so wherever a normal distribution could be used for a hypothesis test, a chi-squared distribution could be used as well.

A further reason the chi-squared distribution is widely used is that it arises as the large-sample distribution of likelihood-ratio tests (LRTs). LRTs have desirable properties; in particular, they commonly provide high power to reject the null hypothesis. However, the normal and chi-squared approximations are valid only asymptotically. For this reason, the t-distribution is preferred to the normal or chi-squared approximation when the sample size is small. Ramsey indicated that the exact binomial test is generally more powerful than the normal approximation.

## The Chi-Square Statistic

Suppose we perform the following statistical experiment. We draw a random sample of size n from a normal population with standard deviation σ, and find that the sample standard deviation is s. With this information we can define a statistic, called chi-square, by the equation

$$\chi^2 = \frac{(n - 1)\, s^2}{\sigma^2}$$

The distribution of the chi-square statistic is called the chi-square distribution. Its probability density function is

$$Y = Y_0 \, (\chi^2)^{(v/2 - 1)} \, e^{-\chi^2 / 2}$$

where Y0 is a constant that depends on the number of degrees of freedom, χ² is the chi-square statistic, v = n - 1 is the number of degrees of freedom, and e is the base of the natural logarithm (approximately 2.71828). Y0 is defined so that the area under the chi-square curve is equal to 1.
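The statistic and the density can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the sample values are made up, and the constant Y0 is taken to be the standard normalizing constant 1 / (2^(v/2) Γ(v/2)), which the text above describes only as "defined so that the area under the curve is 1".

```python
import math
import statistics


def chi_square_statistic(sample, sigma):
    """Compute (n - 1) * s^2 / sigma^2 for a sample and a known population sigma."""
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance, n - 1 in the denominator
    return (n - 1) * s2 / sigma ** 2


def chi_square_pdf(x, v):
    """Density Y0 * x^(v/2 - 1) * e^(-x/2) with v degrees of freedom, for x > 0."""
    if x <= 0:
        return 0.0  # simplification: the x = 0 boundary is not handled specially
    y0 = 1.0 / (2 ** (v / 2) * math.gamma(v / 2))  # assumed normalizing constant
    return y0 * x ** (v / 2 - 1) * math.exp(-x / 2)


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
sigma = 2.0                        # hypothetical known population sigma
stat = chi_square_statistic(sample, sigma)
v = len(sample) - 1
print(f"chi-square statistic = {stat:.3f} with v = {v} degrees of freedom")
print(f"density at the statistic = {chi_square_pdf(stat, v):.4f}")

# Rough midpoint-rule check that the area under the curve is close to 1.
step = 0.01
area = sum(chi_square_pdf((i + 0.5) * step, v) * step for i in range(10000))
print(f"approximate area under the curve on (0, 100) = {area:.4f}")
```

The numerical area check is a simple way to confirm that the chosen Y0 normalizes the density for a given v.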