Kruskal Wallis Test Definition
The Kruskal–Wallis test is a rank-based nonparametric test used to determine whether there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable. Also known as one-way ANOVA on ranks or the Kruskal–Wallis H test, it extends the Mann-Whitney U test, which compares two independent groups, to more than two independent groups. Unlike ANOVA, the Kruskal–Wallis test does not assume that the dependent variable is normally distributed or that the variance of scores is equal across groups; it does still require independent observations. Thus, the test can be applied to both ordinal- and continuous-level dependent variables. When ANOVA's assumptions are met, however, ANOVA is more powerful than the Kruskal–Wallis test.
A Little More on Kruskal Wallis Test
A significant Kruskal–Wallis test indicates that at least one sample stochastically dominates another sample. However, the test does not identify where the dominance occurs or how many pairs of groups show it. To analyze specific sample pairs for stochastic dominance, the recommended methods include Dunn's test, the Conover test, or pairwise Mann-Whitney tests with a correction for multiple comparisons.
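As an illustration of the pairwise follow-up described above, the Python sketch below runs an exact two-sided Mann-Whitney (rank-sum) test on every pair of groups and applies a Bonferroni correction. The group names and scores are made up for illustration, the helper `rank_sum_exact_p` is hypothetical rather than part of any library, and the enumeration assumes small samples with no tied values. The Bonferroni correction is one common choice and is an assumption here, not something the text prescribes.

```python
from itertools import combinations


def rank_sum_exact_p(x, y):
    """Exact two-sided p-value of the Mann-Whitney (rank-sum) test.

    Assumes no tied values and small samples, because every possible
    assignment of ranks to the first group is enumerated.
    """
    pooled = sorted(x + y)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # no ties assumed
    n, total_n = len(x), len(pooled)
    expected = n * (total_n + 1) / 2  # mean rank sum of x under the null
    observed_dev = abs(sum(rank[v] for v in x) - expected)
    hits = count = 0
    for subset in combinations(range(1, total_n + 1), n):
        count += 1
        if abs(sum(subset) - expected) >= observed_dev:
            hits += 1
    return hits / count


# Hypothetical scores for three independent groups (illustration only).
groups = {
    "A": [12, 15, 11, 19, 13],
    "B": [18, 20, 16, 14, 17],
    "C": [25, 22, 24, 21, 23],
}

pairs = list(combinations(groups, 2))
adjusted = {}
for first, second in pairs:
    p = rank_sum_exact_p(groups[first], groups[second])
    # Bonferroni: multiply by the number of comparisons, cap at 1.
    adjusted[(first, second)] = min(1.0, p * len(pairs))

for pair, p_adj in adjusted.items():
    print(pair, round(p_adj, 4))
```

With these made-up scores, the two comparisons involving group C stay below 0.05 after adjustment, while A versus B does not.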
Unlike the analogous one-way analysis of variance, the Kruskal–Wallis test does not assume that the residuals are normally distributed. If the researcher can assume that all groups have identically shaped and scaled distributions, differing at most in their medians, then the null hypothesis is that the medians of all groups are equal. The alternative hypothesis is then that the population median of at least one group differs from the population median of at least one other group.
Examples of questions to be answered using the Kruskal–Wallis test
- Do the scores of job satisfaction differ by race/ethnicity?
- Do test scores differ among the grade levels in an elementary school?
To answer questions like these, the researcher computes the Kruskal–Wallis test statistic, H. When every group contains at least 5 observations, H approximately follows a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups. If the calculated value of H is less than the critical chi-square value, the null hypothesis is not rejected. If the calculated value of H is greater than the critical chi-square value, the researcher can reject the null hypothesis.
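The decision rule above can be sketched in Python as follows. The scores are invented for illustration, the function names are hypothetical, and the formula omits the correction for ties (the sample data has none). With exactly three groups the chi-square reference distribution has 2 degrees of freedom, for which the tail probability is exp(-H/2) and the critical value at level alpha is -2 ln(alpha). That closed form is used here only to avoid dependencies and would not apply to other numbers of groups.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), no tie correction."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical scores for three independent groups (illustration only).
samples = [
    [12, 15, 11, 19, 13],
    [18, 20, 16, 14, 17],
    [25, 22, 24, 21, 23],
]

alpha = 0.05
h_stat = kruskal_wallis_h(samples)
critical = -2 * math.log(alpha)   # chi-square critical value, valid for df = 2 only
p_value = math.exp(-h_stat / 2)   # chi-square tail probability, valid for df = 2 only
reject_null = h_stat > critical

print(f"H = {h_stat:.3f}, critical = {critical:.3f}, "
      f"p = {p_value:.4f}, reject = {reject_null}")
```

For this data H works out to 10.5, which exceeds the critical value of about 5.99, so the null hypothesis would be rejected at the 0.05 level.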
Exact probability tables
Computing exact probabilities for the Kruskal–Wallis test requires substantial computing resources. Existing software provides exact probabilities only for sample sizes of fewer than about 30 participants and relies on an asymptotic approximation for larger samples.
Exact probability values for larger sample sizes are nevertheless available. In 2003, Spurrier published exact probability tables for samples as large as 45 participants. In 2006, Meyer and Seaman produced exact probability distributions for samples as large as 105 participants.
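To give a feel for why exact probabilities are costly, the Python sketch below counts the distinct ways N ranks can be divided among the groups and, for a deliberately tiny design, enumerates all of them to obtain an exact tail probability for H. It is a brute-force illustration under the assumptions of three groups and no ties, with hypothetical function names. It is not the method used by Spurrier or by Meyer and Seaman.

```python
import math
from itertools import combinations


def n_arrangements(sizes):
    """Number of distinct ways to split N ranks into groups of the given sizes."""
    total = math.factorial(sum(sizes))
    for size in sizes:
        total //= math.factorial(size)
    return total


def exact_p_three_groups(sizes, h_observed):
    """Exact P(H >= h_observed) for three groups with no ties, by enumeration."""
    n1, n2, n3 = sizes
    n = n1 + n2 + n3
    all_ranks = set(range(1, n + 1))
    total_rank_sum = n * (n + 1) // 2
    hits = count = 0
    for g1 in combinations(sorted(all_ranks), n1):
        s1 = sum(g1)
        remaining = sorted(all_ranks - set(g1))
        for g2 in combinations(remaining, n2):
            s2 = sum(g2)
            s3 = total_rank_sum - s1 - s2  # third group takes the leftover ranks
            h = (12 / (n * (n + 1))
                 * (s1 ** 2 / n1 + s2 ** 2 / n2 + s3 ** 2 / n3)
                 - 3 * (n + 1))
            count += 1
            if h >= h_observed - 1e-9:  # small tolerance for float rounding
                hits += 1
    return hits / count


print(n_arrangements((3, 3, 3)))   # arrangements to enumerate for 9 observations
print(n_arrangements((5, 5, 5)))   # already far more for 15 observations

p_exact = exact_p_three_groups((3, 3, 3), h_observed=7.2)
p_asymptotic = math.exp(-7.2 / 2)  # chi-square tail probability for df = 2
print(round(p_exact, 4), round(p_asymptotic, 4))
```

For three groups of three, the largest possible H is 7.2 and its exact probability is 6/1680 (about 0.0036), whereas the chi-square approximation gives about 0.027. The gap illustrates why the approximation is reserved for larger groups, and the rapid growth in the number of arrangements is what makes exact tables expensive to produce.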
References for Kruskal Wallis Test
Academic Research on Kruskal Wallis Test
- A generalized Kruskal–Wallis test for comparing K samples subject to unequal patterns of censorship, Breslow, N. (1970). Biometrika, 57(3), 579-594. This paper proposes a generalized Kruskal–Wallis test, which extends Gehan's generalization of Wilcoxon's test, for comparing K samples subject to unequal patterns of censorship. The distribution of the censoring variables is allowed to differ between populations. An alternative statistic is proposed for use when the censoring distributions can be assumed equal. Whether the censoring variables are treated as random or as fixed numbers, these statistics have asymptotic distributions under their respective null hypotheses. Asymptotic power and efficiency are calculated, and several examples are provided.
- The Kruskal–Wallis test and stochastic homogeneity, Vargha, A., & Delaney, H. D. (1998). Journal of Educational and Behavioral Statistics, 23(2), 170-192. This article notes that the Kruskal–Wallis H test is the preferred procedure in many situations when more than two independent samples are compared. Despite its popularity, several points remain unclear among behavioral scientists, including the exact null and alternative hypotheses and the assumptions of the test. The article attempts to clarify inconsistent and controversial treatments of the Kruskal–Wallis test using the concept of stochastic homogeneity.
- Kruskal‐Wallis Test, McKight, P. E., & Najab, J. (2010). The Corsini Encyclopedia of Psychology, 1-1. According to this entry, the Kruskal–Wallis test (Kruskal and Wallis, 1952) is a nonparametric statistical test that assesses the differences among three or more independently sampled groups on a single, non-normally distributed variable. The test is suited to non-normally distributed data such as ordinal or rank data. The entry describes the Kruskal–Wallis test as a more generalized form of the Mann-Whitney U test, since it extends that two-group test to more groups, and as the nonparametric version of ANOVA.
- An empirical comparison of the ANOVA F-test, normal scores test and Kruskal–Wallis test under violation of assumptions, Feir-Walsh, B. J., & Toothaker, L. E. (1974). Educational and Psychological Measurement, 34(4), 789-799. This paper compares three tests, the ANOVA F-test, the Kruskal–Wallis test and the normal scores test, in terms of empirical alpha and empirical power when assumptions are violated. The empirical evidence supports using the F-test for hypotheses about means even when its assumptions are violated. The paper recommends the Kruskal–Wallis test for hypotheses about medians, where it is more effective than the F-test. The normal scores test could not be recommended on the basis of this research, since it performed no better than the ANOVA F-test.
- New approximations to the exact distribution of the Kruskal–Wallis test statistic, Iman, R. L., & Davenport, J. M. (1976). Communications in Statistics - Theory and Methods, 5(14), 1335-1348. According to this paper, existing approximations to the exact distribution of the Kruskal–Wallis statistic fall into two classes: those that are computationally difficult but accurate, and those that are easy to compute but less accurate. The paper introduces new approximations and compares them with the popular existing ones. The comparisons use exact probabilities where available and Monte Carlo simulation otherwise.
- Power study of anova versus Kruskal–Wallis test, Hecke, T. V. (2012). Journal of Statistics and Management Systems, 15(2-3), 241-247. This paper compares ANOVA and the Kruskal–Wallis test when the assumption of normally distributed populations is violated. The power of the tests is determined using a permutation method rather than simulation. According to the results, the nonparametric Kruskal–Wallis test performs better than its parametric equivalent, ANOVA, for asymmetric populations.
- A multivariate Kruskal–Wallis test with post hoc procedures, Katz, B. M., & McSweeney, M. (1980). Multivariate Behavioral Research, 15(3), 281-297. This paper presents a statistic that is a nonparametric analogue of one-way MANOVA and shows that it is a multivariate extension of the nonparametric Kruskal–Wallis test (1952). The sample reference distribution and a set of computational formulas for the test statistic are derived. Two post hoc procedures are developed and compared, and the method is illustrated with data from the behavioral sciences.
- Estimation of the power of the Kruskal‐Wallis test, Mahoney, M., & Magel, R. (1996). Biometrical Journal, 38(5), 613-630. This article concerns estimating the power of the Kruskal–Wallis one-way analysis of variance by ranks test for a location shift. Estimation is needed because calculating the power of a statistical test requires the underlying population distribution(s) to be completely specified, and that knowledge is often unavailable. The authors investigate an extension of the data-based power estimation method presented by Collings and Hamilton (1988). An advantage of the method is that it produces the power estimate from the empirical cumulative distribution functions of the sample data.
- The ANOVA F-Test Versus The Kruskal–Wallis Test: A Robustness Study, Feir, B. J., & Toothaker, L. E. (1974). This paper compares the Kruskal–Wallis test and the ANOVA F-test on factors such as the probability of a Type I error and patterns of sample size inequality. The comparison is motivated by the dilemma researchers face in choosing between parametric and nonparametric procedures when parametric assumptions are violated. The results showed that the Kruskal–Wallis test is competitive with the F-test in terms of alpha but not power. According to the authors, the power of the Kruskal–Wallis test was grossly affected, while the F-test was found to be generally robust for the types of mean differences specified.
- Methodology and Application of the Kruskal–Wallis Test, Ostertagová, E., Ostertag, O., & Kováč, J. (2014). Applied Mechanics & Materials, (611). This paper describes the methodology and application of the Kruskal–Wallis test. The test is used to compare more than two independent samples and to determine whether those samples come from the same distribution. The paper also describes the Kruskal–Wallis test as a powerful alternative to one-way analysis of variance. When the Kruskal–Wallis test is significant, multiple comparison tests are useful for further analysis.
- A note on power and sample size calculations for the Kruskal–Wallis test for ordered categorical data, Fan, C., & Zhang, D. (2012). Journal of Biopharmaceutical Statistics, 22(6), 1162-1173. This article generalizes the power and sample size procedures proposed by Fan et al. (2011). According to the simulations, the proposed power and sample size formulas perform well. The application of the methods is demonstrated using a myelin oligodendrocyte glycoprotein (MOG) induced experimental autoimmune encephalomyelitis (EAE) example.