Volume 9, Issue 4
Computational Order Statistics
4. Extensions and Forthcoming Features
Thus far, this article has assumed that we are dealing with samples of iid variables. In this section, we take the major step of relaxing these assumptions. The generalisation to non-identical distributions is an enormously flexible and powerful capability. To do so, we require the new Piecewise functionality found in Mathematica 5.1 or later. In the examples that follow, we provide an illustration/preview of this new functionality as already implemented in the developmental version of mathStatica and which will be available in its next public release.
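Let $X_i$ denote a continuous random variable with pdf $f_i(x)$ and cdf $F_i(x)$, such that $X_1, \ldots, X_n$ are independent but not identical variables due to differing parameters $\theta_i$, for $i = 1, \ldots, n$. For example, consider an Exponential parent where identicality is relaxed by replacing its parameter $\theta$ with $\theta_i$, for each $i$. Thus: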
Then, the pdf of the minimum order statistic (in, say, a sample of size 4) is:
The pdf of the next largest order statistic, $X_{(2)}$, is substantially more complicated:
Next, let us suppose we have three completely different distributions defined over three different domains of support. In the following, we define in turn the pdf of an Exponential, the pdf of a standard Normal, and the pdf of a Uniform random variable:
We can now solve completely general questions. For example, let us suppose we have a random sample of size $n = 20$. Of this sample, suppose that 10 values are drawn from the Normal, seven from the Exponential, and three from the Uniform. What is the pdf of the second smallest value from the sample, namely the second order statistic? Solving this problem would normally be enormously complicated, but the solution is now given simply by:
The output can be viewed in the electronic version of this notebook.
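For readers without the notebook to hand, the cdf behind this result can be approximated numerically. The sketch below is not the mathStatica implementation. It is a Python illustration that relies on the fact that, for independent variables, the $r$-th order statistic is at most $x$ exactly when at least $r$ of the variables are at most $x$. The Exponential rate of 1, the Uniform(0, 1) support, and all function names are assumptions made for the illustration.

```python
import math


def normal_cdf(x):
    """cdf of the standard Normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exponential_cdf(x, rate=1.0):
    """cdf of an Exponential; rate=1 is an assumed parameter value."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def uniform_cdf(x, a=0.0, b=1.0):
    """cdf of a Uniform(a, b); the (0, 1) support is an assumption."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


def order_stat_cdf(r, cdf_values):
    """P(X_(r) <= x), given p_i = F_i(x) for independent, non-identical X_i."""
    # dist[k] = P(exactly k of the variables processed so far are <= x)
    dist = [1.0]
    for p in cdf_values:
        new = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            new[k] += w * (1.0 - p)   # this variable exceeds x
            new[k + 1] += w * p       # this variable is <= x
        dist = new
    return sum(dist[r:])              # at least r variables are <= x


def mixed_sample_cdf(x, r=2):
    """cdf of the r-th order statistic: 10 Normal, 7 Exponential, 3 Uniform."""
    ps = [normal_cdf(x)] * 10 + [exponential_cdf(x)] * 7 + [uniform_cdf(x)] * 3
    return order_stat_cdf(r, ps)


def pdf_numeric(x, r=2, h=1e-5):
    """Central-difference approximation to the pdf (avoid the kinks at 0 and 1)."""
    return (mixed_sample_cdf(x + h, r) - mixed_sample_cdf(x - h, r)) / (2.0 * h)
```

Calling `pdf_numeric` over a grid of points gives a curve that can be set against the symbolic pdf. The numerical route gives no closed form, which is the advantage of the symbolic approach described in the text.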
This same technology provides a neat way to solve problems such as finding the pdf of $\min(X, Y, Z)$, when $X$, $Y$ and $Z$ have completely different distributions and different domains of support. For our example, if $X$ is the Exponential, $Y$ the standard Normal and $Z$ the Uniform variable defined above, then the pdf of $\min(X, Y, Z)$ is simply the pdf of the first order statistic:
with domain of support:
Here is a plot of the pdf we have just derived:
We can easily 'check' our solution using Monte Carlo methods. Here are 100,000 pseudorandom drawings from each of the three distributions:
Next, we create 100,000 samples of size 3 containing one drawing from each of the three distributions, and then map the Min function across each sample, generating our 100,000 empirical drawings of the sample minimum:
Figure 8 compares the empirical pdf of the data we have just generated with the theoretical pdf derived earlier.
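The same Monte Carlo check can be sketched outside Mathematica. The Python below is an illustration only, not the notebook's code: it assumes an Exponential with rate 1, a standard Normal and a Uniform(0, 1), and compares the empirical cdf of the simulated minima with the cdf implied by independence, $1 - \prod_i (1 - F_i(x))$. All function names are hypothetical.

```python
import math
import random


def normal_cdf(x):
    """cdf of the standard Normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exponential_cdf(x, rate=1.0):
    """cdf of an Exponential; rate=1 is an assumed parameter value."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def uniform_cdf(x):
    """cdf of a Uniform(0, 1); the support is an assumption."""
    return min(1.0, max(0.0, x))


def min_cdf(x):
    """Theoretical cdf of the minimum of the three independent variables."""
    survival = (
        (1.0 - exponential_cdf(x))
        * (1.0 - normal_cdf(x))
        * (1.0 - uniform_cdf(x))
    )
    return 1.0 - survival


def simulate_minima(n_draws=100_000, seed=0):
    """One drawing from each distribution per sample, then the sample minimum."""
    rng = random.Random(seed)
    return [
        min(rng.expovariate(1.0), rng.gauss(0.0, 1.0), rng.random())
        for _ in range(n_draws)
    ]


def empirical_cdf(data, x):
    """Proportion of the simulated values that are <= x."""
    return sum(1 for d in data if d <= x) / len(data)
```

Evaluating `empirical_cdf(simulate_minima(), x)` and `min_cdf(x)` at a few points should give values that agree to within sampling error. Comparing cdfs rather than pdfs avoids having to choose a histogram bin width.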
Independence Relaxed, Identicality Maintained
The distribution of an order statistic is derived as a many-to-one transformation from the joint distribution of $(X_1, \ldots, X_n)$. Although there are differing ways in which the distribution can be found, perhaps the simplest method uses equivalent events. To illustrate, the event $\{X_{(n)} \le x\}$ is equivalent to $\{X_i \le x$, for all $i = 1, \ldots, n\}$, where $x$ is arbitrarily chosen. Accordingly, the cdf of $X_{(n)}$ in terms of $(X_1, \ldots, X_n)$ is given by
$$P(X_{(n)} \le x) = P(X_1 \le x, \ldots, X_n \le x). \tag{1}$$
In the case of $X_{(n)}$ (and $X_{(1)}$ too), there is only one equivalent event. The number of equivalent events increases substantially as other inner order statistics are considered; however, in order to keep our discussion as simple as possible we will confine attention to $X_{(n)}$ from here on. This is not necessarily a case without interest, for the extremal $X_{(n)}$ is encountered in many practical contexts. Examples include $X_{(n)}$ as the measure of record high temperatures and record times in sports.
If the standard iid assumptions hold, then $(X_1, \ldots, X_n)$ is a collection of mutually independent random variables, and all are copies of the same parent $X$ (with pdf $f(x)$ and cdf $F(x)$), leading to considerable simplification in the right-hand side of (1):
$$P(X_{(n)} \le x) = F(x)^n. \tag{2}$$
If, for example, $X$ is continuous, then the pdf of $X_{(n)}$ is obtained by differentiation of (2) with respect to $x$, yielding $n f(x) F(x)^{n-1}$.
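The parallel result for the sample minimum, the other case with a single equivalent event, follows from the same argument applied to survival probabilities. As a worked sketch under the iid assumptions, writing $F$ and $f$ for the parent cdf and pdf, $n$ for the sample size and $X_{(1)}$ for the minimum:

```latex
\begin{aligned}
P(X_{(1)} > x)   &= P(X_1 > x, \ldots, X_n > x) = \bigl(1 - F(x)\bigr)^{n}, \\
P(X_{(1)} \le x) &= 1 - \bigl(1 - F(x)\bigr)^{n}, \\
\frac{d}{dx}\, P(X_{(1)} \le x) &= n\, f(x)\, \bigl(1 - F(x)\bigr)^{n-1}.
\end{aligned}
```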
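Just as identicality can be relaxed in many ways, so too can independence. To introduce a dependence structure, we may begin by rewriting (1) in its copula form
$$\begin{aligned} P(X_{(n)} \le x) &= C\bigl(F_1(x), \ldots, F_n(x)\bigr) \\ &= C\bigl(F(x), \ldots, F(x)\bigr) \end{aligned} \tag{3}$$
where the second line recognises that $X_1, \ldots, X_n$ are copies of a common parent $X$ with cdf $F$. The $n$-copula $C$ is the function that represents the dependence structure of $(X_1, \ldots, X_n)$. For example, the special case $C(u_1, \ldots, u_n) = u_1 u_2 \cdots u_n$ corresponds to mutual independence amongst $X_1, \ldots, X_n$. The representation (3) is due to Sklar and is unique provided $X$ is continuous.
Tractable results can be obtained by assuming an Archimedean dependence structure for $C$; for details of these copulas see [11, Chapter 4]. Let $\phi$ denote the strict generator function associated with $C$; it is differentiable such that $\phi'(t) < 0$ on $(0, 1)$, and its inverse $\phi^{-1}$ must be completely monotonic if $n > 2$. For example, the generator associated with the independence copula, $\phi(t) = -\log t$, is strict, and its inverse $\phi^{-1}(s) = e^{-s}$ is completely monotonic. Then, the key property of the generator is that
$$\phi\bigl(G(x)\bigr) = n\, \phi\bigl(F(x)\bigr) \tag{4}$$
where $G(x)$ is the cdf of $X_{(n)}$. Differentiating both sides of (4) with respect to $x$ and rearranging yields the pdf of $X_{(n)}$:
$$g(x) = n f(x)\, \frac{\phi'\bigl(F(x)\bigr)}{\phi'\bigl(G(x)\bigr)} \tag{5}$$
where the denominator would be computed as per $G(x) = \phi^{-1}\bigl(n\, \phi(F(x))\bigr)$. The resemblance in the structure of the pdf (5) to the pdf in the iid case, $n f(x) F(x)^{n-1}$, is striking.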
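The Archimedean construction can be checked numerically. The sketch below is a Python illustration rather than the article's code. It takes a generator $\phi$, its derivative and its inverse, computes the cdf of the maximum as $\phi^{-1}(n\,\phi(F(x)))$, and forms the pdf as $n f(x)\,\phi'(F(x)) / \phi'(G(x))$. The standard Exponential parent and all function names are assumptions chosen for the illustration.

```python
import math


def archimedean_max_pdf(x, n, f, F, phi, dphi, phi_inv):
    """pdf of the sample maximum under an Archimedean copula with generator phi."""
    Fx = F(x)
    G = phi_inv(n * phi(Fx))          # cdf of the maximum at x
    return n * f(x) * dphi(Fx) / dphi(G)


# Generator of the independence copula, with its derivative and inverse.
def indep_phi(t):
    return -math.log(t)


def indep_dphi(t):
    return -1.0 / t


def indep_phi_inv(s):
    return math.exp(-s)


# Hypothetical parent for the illustration: standard Exponential.
def parent_pdf(x):
    return math.exp(-x)


def parent_cdf(x):
    return 1.0 - math.exp(-x)


def iid_max_pdf(x, n):
    """Familiar iid result n f(x) F(x)^(n-1), used as the benchmark."""
    return n * parent_pdf(x) * parent_cdf(x) ** (n - 1)
```

With the independence generator, `archimedean_max_pdf` should agree with `iid_max_pdf` up to rounding error, which is a quick sanity check before substituting a generator that induces dependence.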
To illustrate, let the parent $X$ have pdf $f(x)$:
and cdf $F(x)$:
and summarise our assumptions:
Enter the details for a particular case considered by Ballerini, namely, that of the Gumbel-Hougaard family of $n$-copulas with generator $\phi(t) = (-\log t)^{\theta}$, with dependence parameter $\theta \ge 1$:
Then, from (5), the pdf of $X_{(n)}$ is given by
with domain of support:
Setting $\theta = 1$ corresponds to the iid case; notice too that replacing $n$ in the iid pdf $n f(x) F(x)^{n-1}$ with $n^{1/\theta}$ yields the pdf of $X_{(n)}$. This, for example, means that many algebraic results on the properties of $X_{(n)}$ can be found simply by replacing $n$ with $n^{1/\theta}$ in the appropriate iid formula. In Figure 9, the solid line denotes the pdf when $\theta = 1$ (the iid case), while the dashed line denotes the pdf for a dependent case with $\theta > 1$.
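The substitution rule for the Gumbel-Hougaard case can be verified numerically. With generator $(-\log t)^{\theta}$, the cdf of the maximum reduces to $F(x)^{m}$ with $m = n^{1/\theta}$, so the pdf is $m f(x) F(x)^{m-1}$. The Python sketch below compares this closed form with the general generator-based expression. It is an illustration only: the standard Exponential parent is an assumption, since the parent used in the notebook is not reproduced here, and the function names are hypothetical.

```python
import math


def gh_phi(t, theta):
    """Gumbel-Hougaard generator (-log t)^theta, for theta >= 1."""
    return (-math.log(t)) ** theta


def gh_dphi(t, theta):
    """Derivative of the generator with respect to t."""
    return -theta * (-math.log(t)) ** (theta - 1.0) / t


def gh_phi_inv(s, theta):
    """Inverse of the generator."""
    return math.exp(-(s ** (1.0 / theta)))


# Hypothetical parent for the illustration: standard Exponential.
def parent_pdf(x):
    return math.exp(-x)


def parent_cdf(x):
    return 1.0 - math.exp(-x)


def max_pdf_general(x, n, theta):
    """pdf of the maximum from n f(x) phi'(F(x)) / phi'(G(x))."""
    Fx = parent_cdf(x)
    G = gh_phi_inv(n * gh_phi(Fx, theta), theta)
    return n * parent_pdf(x) * gh_dphi(Fx, theta) / gh_dphi(G, theta)


def max_pdf_closed_form(x, n, theta):
    """iid pdf with n replaced by m = n^(1/theta)."""
    m = n ** (1.0 / theta)
    return m * parent_pdf(x) * parent_cdf(x) ** (m - 1.0)
```

The two functions should agree for any admissible $\theta$, and at $\theta = 1$ both reduce to the iid pdf. As $\theta$ grows, $m$ falls towards 1 and the pdf of the maximum moves towards that of a single observation, which is consistent with stronger positive dependence.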
© Wolfram Media, Inc. All rights reserved.