The Mathematica Journal
Volume 9, Issue 4

Applications of Generating Functions in Nonparametric Tests
Peter Weiß

The Kruskal-Wallis Analysis of Variance

The Kruskal-Wallis analysis of variance is a generalization of the Wilcoxon rank sum test. The two tests are related in the same way as the well-known analysis of variance and the two-sample Student $t$-test.

Suppose that $s$ independent samples of sizes $n_1, \dots, n_s$ are drawn from continuous (not necessarily normal) populations. We want to test the null hypothesis $H_0$ (these populations are identically distributed) against the alternative hypothesis $H_1$ (these populations are not identically distributed).

No Ties

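Let us assume first that there are no ties. In that case, we calculate the sum $R_i$ of the ranks of the $n_i$ observations of the $i$th sample in the combined ordered arrangement of these samples and reject the null hypothesis $H_0$ if the Kruskal-Wallis statistic

$$K = \frac{12}{N(N+1)} \sum_{i=1}^{s} \frac{R_i^2}{n_i} - 3(N+1)$$

with $N = n_1 + \dots + n_s$ is too large.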
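As a small illustration, the statistic can be computed from raw samples as follows. This is a minimal Python sketch, not code from the article; the function name `kruskal_wallis` and the sample data are hypothetical, and the code assumes that all pooled observations are distinct.

```python
from fractions import Fraction

def kruskal_wallis(samples):
    """Kruskal-Wallis statistic K for samples without ties (exact rational).

    Uses the standard form K = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1).
    """
    # Pool the observations, remembering which sample each one came from.
    pooled = sorted((x, i) for i, sample in enumerate(samples) for x in sample)
    n_total = len(pooled)
    # Rank sums R_i: the smallest pooled observation receives rank 1.
    rank_sums = [0] * len(samples)
    for rank, (_, i) in enumerate(pooled, start=1):
        rank_sums[i] += rank
    weighted = sum(Fraction(r * r, len(s)) for r, s in zip(rank_sums, samples))
    return Fraction(12, n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical data: three samples of size 2, perfectly separated.
samples = [[1.1, 2.3], [3.0, 4.2], [5.5, 6.1]]
k = kruskal_wallis(samples)
print(k)  # 32/7
```

Exact rational arithmetic is used so that equal values of the statistic compare equal, which matters when tabulating an exact null distribution.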

To calculate the generating function of the null distribution of the Kruskal-Wallis test statistic, it is important to note that

Thus, we get the generating function by applying the substitution

to

and the substitution

to the result of the first substitution.
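The generating-function idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the article's substitutions: it expands the polynomial $\prod_{j=1}^{N} \sum_{i=1}^{s} y_i x_i^j$, in which the coefficient of $\prod_i y_i^{n_i} x_i^{r_i}$ counts the ways of assigning the ranks $1, \dots, N$ so that sample $i$ receives $n_i$ ranks with sum $r_i$, and then maps each rank-sum vector to its value of the statistic. The function names are hypothetical.

```python
from collections import Counter, defaultdict
from fractions import Fraction

def rank_sum_counts(sizes):
    """Expand prod_{j=1..N} (sum_i y_i * x_i**j) and keep the terms whose
    y-exponents equal `sizes`; returns {rank-sum tuple: number of assignments}."""
    s, n_total = len(sizes), sum(sizes)
    # A monomial is keyed by (counts per sample, rank sums per sample).
    poly = {((0,) * s, (0,) * s): 1}
    for rank in range(1, n_total + 1):
        nxt = defaultdict(int)
        for (counts, sums), coeff in poly.items():
            for i in range(s):
                if counts[i] < sizes[i]:  # drop terms that can never match `sizes`
                    c = counts[:i] + (counts[i] + 1,) + counts[i + 1:]
                    r = sums[:i] + (sums[i] + rank,) + sums[i + 1:]
                    nxt[(c, r)] += coeff
        poly = nxt
    return {sums: coeff for (_, sums), coeff in poly.items()}

def kw_null_distribution(sizes):
    """Exact null distribution {K value: probability} in the no-ties case."""
    n_total = sum(sizes)
    counts = rank_sum_counts(sizes)
    total = sum(counts.values())  # equals N! / (n_1! ... n_s!)
    dist = Counter()
    for sums, coeff in counts.items():
        k = (Fraction(12, n_total * (n_total + 1))
             * sum(Fraction(r * r, n) for r, n in zip(sums, sizes))
             - 3 * (n_total + 1))
        dist[k] += Fraction(coeff, total)
    return dict(sorted(dist.items()))

dist = kw_null_distribution((2, 2, 2))
for k, p in dist.items():
    print(k, p)
```

Under the null hypothesis every assignment of ranks to samples is equally likely, so dividing each coefficient by the total number of assignments yields the probabilities.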

Therefore, we have, for example,

It is well known (we refer again to Lehmann [2]) that for large sample sizes $n_1, \dots, n_s$ the null distribution of the test statistic can be approximated by the chi-square distribution with $s-1$ degrees of freedom. We are now in a position to study the quality of this approximation for relatively small sample sizes. Notably, the approximation is not very good in the most important region, between the 0.95 and the 0.99 quantiles (Figure 3).
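The comparison can also be made numerically. The following Python sketch is an illustration only: it uses brute-force enumeration rather than generating functions, it is restricted to three samples, and the sample sizes and the threshold 5.6 are arbitrary choices. For three samples the approximating chi-square distribution has 2 degrees of freedom, whose upper tail probability is $e^{-x/2}$.

```python
import math
from fractions import Fraction
from itertools import combinations

def exact_tail(sizes, k_obs):
    """P(K >= k_obs) under H0 for three samples without ties, by enumerating
    every split of the ranks 1..N into groups of the given sizes."""
    n1, n2, n3 = sizes
    n_total = n1 + n2 + n3
    ranks = set(range(1, n_total + 1))
    hits = total = 0
    for g1 in combinations(sorted(ranks), n1):
        rest = ranks - set(g1)
        for g2 in combinations(sorted(rest), n2):
            g3 = rest - set(g2)
            sums = (sum(g1), sum(g2), sum(g3))
            k = (Fraction(12, n_total * (n_total + 1))
                 * sum(Fraction(r * r, n) for r, n in zip(sums, sizes))
                 - 3 * (n_total + 1))
            total += 1
            hits += k >= k_obs
    return Fraction(hits, total)

def chi2_tail_2df(x):
    """Upper tail probability of the chi-square distribution with 2 degrees of freedom."""
    return math.exp(-x / 2)

sizes = (3, 3, 3)
k_obs = Fraction(28, 5)  # threshold 5.6, chosen arbitrarily for illustration
exact = float(exact_tail(sizes, k_obs))
approx = chi2_tail_2df(float(k_obs))
print(f"exact tail: {exact:.4f}, chi-square approximation: {approx:.4f}")
```

Running the comparison over a grid of thresholds gives a numerical counterpart to the graphical comparison in the figure.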

Figure 3.

Ties

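If ties occur, we assign to tied observations the same midrank and calculate the sum $R_i^*$ of the midranks of the $n_i$ observations of the $i$th sample in the combined ordered arrangement of these samples. We reject the null hypothesis $H_0$ if the modified Kruskal-Wallis statistic

$$K^* = \frac{\dfrac{12}{N(N+1)} \sum_{i=1}^{s} \dfrac{R_i^{*2}}{n_i} - 3(N+1)}{1 - \dfrac{\sum_{j=1}^{e} (d_j^3 - d_j)}{N^3 - N}}$$

with $d_1, \dots, d_e$ denoting the numbers of observations tied at each of the $e$ distinct values

and $N = n_1 + \dots + n_s$ is too large.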
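The tie-corrected statistic can be sketched in Python as follows. This is a minimal illustration, not code from the article: the function name `kruskal_wallis_ties` and the data are hypothetical, and the correction used is the standard one, dividing the uncorrected statistic by $1 - \sum_j (d_j^3 - d_j)/(N^3 - N)$, where $d_j$ is the number of observations tied at the $j$th distinct value.

```python
from collections import Counter
from fractions import Fraction

def kruskal_wallis_ties(samples):
    """Tie-corrected Kruskal-Wallis statistic based on midranks (exact rational).

    Raises ZeroDivisionError if all pooled observations are equal.
    """
    pooled = sorted(x for sample in samples for x in sample)
    n_total = len(pooled)
    # First position (1-based) occupied by each distinct value in the pooled order.
    first = {}
    for pos, x in enumerate(pooled, start=1):
        first.setdefault(x, pos)
    tie_sizes = Counter(pooled)
    # Midrank = average of the positions first, ..., first + d - 1.
    midrank = {x: Fraction(2 * first[x] + d - 1, 2) for x, d in tie_sizes.items()}
    weighted = sum(Fraction(sum(midrank[x] for x in s)) ** 2 / len(s)
                   for s in samples)
    k = Fraction(12, n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    correction = 1 - Fraction(sum(d**3 - d for d in tie_sizes.values()),
                              n_total**3 - n_total)
    return k / correction

# Hypothetical data with two pairs of tied observations.
samples = [[1, 2], [2, 3], [3, 4]]
k_star = kruskal_wallis_ties(samples)
print(k_star)  # 245/66
```

When no ties are present, every $d_j$ equals 1, the correction factor is 1, and the function returns the ordinary Kruskal-Wallis statistic.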

To calculate the generating function of the null distribution of this test statistic, it is again essential to note that

Thus, we get this generating function by applying the substitution

to

and the substitution

to the result of the first substitution.
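With ties, the same polynomial expansion works when the exponents are the midranks instead of the ranks. The Python sketch below is an illustration under stated assumptions, not the article's substitutions: it computes the null distribution conditional on the observed tie pattern, uses doubled midranks to keep the exponents integral, and applies the standard tie correction. The function name and the example midranks are hypothetical.

```python
from collections import Counter, defaultdict
from fractions import Fraction

def kw_ties_null_distribution(sizes, midranks):
    """Exact conditional null distribution {K* value: probability}, given the
    pooled midranks; expands prod_j (sum_i y_i * x_i**(2 * m_j))."""
    s, n_total = len(sizes), sum(sizes)
    doubled = [int(2 * m) for m in midranks]  # integer exponents
    poly = {((0,) * s, (0,) * s): 1}
    for e in doubled:
        nxt = defaultdict(int)
        for (counts, sums), coeff in poly.items():
            for i in range(s):
                if counts[i] < sizes[i]:
                    c = counts[:i] + (counts[i] + 1,) + counts[i + 1:]
                    r = sums[:i] + (sums[i] + e,) + sums[i + 1:]
                    nxt[(c, r)] += coeff
        poly = nxt
    # Tie sizes d_j are the multiplicities of the distinct midranks.
    ties = Counter(doubled).values()
    correction = 1 - Fraction(sum(d**3 - d for d in ties), n_total**3 - n_total)
    total = sum(poly.values())
    dist = defaultdict(Fraction)
    for (_, sums2), coeff in poly.items():
        # sums2 holds doubled midrank sums, hence the factor 4 in the denominator.
        weighted = sum(Fraction(r2 * r2, 4 * n) for r2, n in zip(sums2, sizes))
        k = Fraction(12, n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
        dist[k / correction] += Fraction(coeff, total)
    return dict(sorted(dist.items()))

# Midranks of the hypothetical pooled data 1, 2, 2, 3, 3, 4.
dist_ties = kw_ties_null_distribution((2, 2, 2), [1, 2.5, 2.5, 4.5, 4.5, 6])
for k, p in dist_ties.items():
    print(k, p)
```

Because the tie correction rescales the variance of the midrank sums, the mean of this conditional distribution is again $s - 1$, which provides a quick sanity check on the output.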

Again we have, for example,

Lehmann [2] mentions that for large sample sizes $n_1, \dots, n_s$ the null distribution of the test statistic can again be approximated by the chi-square distribution with $s-1$ degrees of freedom. By experimenting, we can investigate the quality of this approximation graphically for small sample sizes. Again, the approximation is not very good in the most important region, between the 0.95 and the 0.99 quantiles (Figure 4).

Figure 4.

The following picture shows the approximation of the null distribution of the Wilcoxon rank sum statistic with ties (Figure 2) and without ties (Figure 1) by an appropriate normal distribution, and the approximation of the null distribution of the Kruskal-Wallis statistic with ties (Figure 4) and without ties (Figure 3) by an appropriate chi-square distribution.



     
© Wolfram Media, Inc. All rights reserved.