Volume 9, Issue 4

Articles
In and Out
Trott's Corner
New Products
New Publications
Calendar
News Bulletins
New Resources
Classifieds

Editorial Policy
Staff and Contributors
Submissions
Subscriptions
Back Issues
Contact Information

Applications of Generating Functions in Nonparametric Tests

Two Combinatorial Problems

The crucial point in nonparametric test theory is that all possible arrangements of the ranks of the observed values are equally likely. To calculate, for example, the distribution density of a statistic Stat, which is based on ranks, it is therefore only necessary to obtain the number of cases fulfilling the condition . The two following combinatorial problems give essential ideas in doing this.

Problem 1

Suppose we have balls, which are numbered . In how many different ways

is it possible to divide these balls among urns, such that

• the -th urn contains balls and
• the sum of the numbers of these balls in urn is

with and ?

For small and we can calculate this number by counting the relevant partitions. For example, there are the following 15 partitions of balls into urns with balls in urn and balls in urn :

 U1 r1 U2 r2 U1 r1 U2 r2 U1 r1 U2 r2 {1,2,3,4} 10 {5,6} 11 {1,2,3,6} 12 {4,5} 9 {1,3,5,6} 15 {2,4} 6 {1,2,3,5} 11 {4,6} 10 {1,2,4,6} 13 {3,5} 8 {2,3,5,6} 16 {1,4} 5 {1,2,4,5} 12 {3,6} 9 {1,3,4,6} 14 {2,5} 7 {1,4,5,6} 16 {2,3} 5 {1,3,4,5} 13 {2,6} 8 {2,3,4,6} 15 {1,5} 6 {2,4,5,6} 17 {1,3} 4 {2,3,4,5} 14 {1,6} 7 {1,2,5,6} 14 {3,4} 7 {3,4,5,6} 18 {1,2} 3

Thus, and . If and are not as small as in this example, this method fails because there are too many partitions of balls into urns with balls in urn .

Problem 2

Suppose we have balls, which are marked by real numbers in the following way: of these balls have mark , have mark , and have mark (with ). In how many different ways

is it possible to divide these balls among urns, such that

• the -th urn contains balls and
• the sum of the marks of these balls is

with and ?

This problem is obviously a generalization of our first problem, since for we have

For small and it is again easy to calculate by counting the relevant partitions. If we mark of the balls with the mark (let us assume that these are the balls with the numbers 1, 2, 3) and the remaining balls with the mark , and if we put again of these balls in urn and of these balls in urn , we obtain (the same) 15 different partitions of our marked balls. (In the following table we show only their marks.)

 U1 r1 U2 r2 U1 r1 U2 r2 U1 r1 U2 r2 {1,1,1,2} 5 {2,2} 4 {1,1,1,2} 5 {2,2} 4 {1,1,2,2} 6 {1,2} 3 {1,1,1,2} 5 {2,2} 4 {1,1,2,2} 6 {1,2} 3 {1,1,2,2} 6 {1,2} 3 {1,1,2,2} 6 {1,2} 3 {1,1,2,2} 6 {1,2} 3 {1,2,2,2} 7 {1,1} 2 {1,1,2,2} 6 {1,2} 3 {1,1,2,2} 6 {1,2} 3 {1,2,2,2} 7 {1,1} 2 {1,1,2,2} 6 {1,2} 3 {1,1,2,2} 6 {1,2} 3 {1,2,2,2} 7 {1,1} 2

Thus and . If and are not as small as in this example, this method fails again.

A Simple Method

The aim of this section is to find the generating functions for the numbers and with , , , and .

With some experience it is easy to see that our generating functions must have two sorts of variables: for each urn one variable governing the number of balls in this urn and one variable governing the sum of the numbers (marks) of the balls in this urn. With this remark in mind, the generating functions for the numbers and are obviously the polynomials

To be more precise, the numbers and are the coefficients of

of the polynomials and .

In Mathematica, we define these two polynomials and recursively:

Now we get the numbers and by selecting the coefficients of the polynomials and :

Using these functions, we solve the combinatorial problems posed at the beginning of this section and a problem, which can not be solved by hand.

A More Sophisticated Method

Unfortunately this easy method wastes time and memory (for instance, the last calculation needs more than 100 MB RAM). This disadvantage is because the number of terms of the polynomials and are of order , which is a huge number even if and are not very large. We overcome this difficulty by fixing . In this case the number of terms of the generating functions and of and are only of order . This is also a large number, but an essentially smaller one than .

It is obvious that these new generating functions and are the coefficients of

of our old polynomials and . Of course we do not use this fact, since it would not save any memory. What we do is calculate these new generating functions and recursively using the obvious formulas

and some self-evident boundary conditions, if some of the 's are zero or is zero.

Now we get and by selecting the coefficients of

of the polynomials and :

The following examples show the efficiency of this method: