Volume 9, Issue 4
Moment-Based Density Approximants
2. Approximants Based on Legendre Polynomials
This section derives a polynomial density approximation formula that applies to any continuous distribution with compact support. The approximant follows from a known analytical result, which is restated here in statistical nomenclature.
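The density function $f_X(x)$ of a continuous random variable $X$ that is defined on the interval $[-1, 1]$ can be expressed as follows:
$$f_X(x) = \sum_{k=0}^{\infty} \lambda_k P_k(x),$$
where $P_k(x)$ is a Legendre polynomial of degree $k$ in $x$, that is,
$$P_k(x) = \sum_{i=0}^{\lfloor k/2 \rfloor} (-1)^i \frac{(2k-2i)!}{2^k \, i! \, (k-i)! \, (k-2i)!} \, x^{k-2i},$$
$\lfloor k/2 \rfloor$ denoting the largest integer less than or equal to $k/2$, and
$$\lambda_k = \frac{2k+1}{2} P_k^*(X),$$
with $P_k^*(X) = P_k(X)$, wherein $X^{k-2i}$ is replaced by the $(k-2i)$th moment of $X$:
$$\mu_X(k-2i) = E(X^{k-2i}) = \int_{-1}^{1} x^{k-2i} f_X(x) \, dx.$$
Legendre polynomials can also be obtained by means of a recurrence relationship, which is derived for instance in [9, 178]. Given the first $n$ moments of $X$, $\mu_X(1), \ldots, \mu_X(n)$, and setting $\mu_X(0) = 1$, the following truncated series denoted by $f_{X_n}(x)$ can be used as a polynomial approximation to $f_X(x)$:
$$f_{X_n}(x) = \sum_{k=0}^{n} \lambda_k P_k(x).$$
This series can also be written in Mathematica notation, in which the pattern-matching symbol (typed :> in Input mode) conveniently replaces each power of $X$ in $P_k(X)$ with the corresponding moment of $X$. Expressions written with this symbol (excluding the punctuation at the end of the formulae) can readily be used in a Mathematica notebook.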
As explained in [14, 439], this polynomial turns out to be the least-squares approximating polynomial of degree $n$ that minimizes the integrated squared error, that is, $\int_{-1}^{1} (f_X(x) - f_{X_n}(x))^2 \, dx$. As stated in [15, 106], the moments of any continuous random variable whose support is a closed interval uniquely determine its distribution. Moreover, as shown by [10, 304], the rate of convergence of the supremum of the absolute error, $|f_X(x) - f_{X_n}(x)|$, depends on $f_X(x)$ and $n$ via a continuity modulus. It follows that more accurate approximants can always be obtained by making use of higher degree polynomials.
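The construction on $[-1, 1]$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's Mathematica code: the function names `legendre_coeffs` and `legendre_approximant` are hypothetical, and the test density $(3/4)(1 - x^2)$ was chosen here because its moments are known exactly. Exact fractions are used throughout to avoid round-off.

```python
from fractions import Fraction
from math import factorial


def legendre_coeffs(k):
    """Coefficients c[j] of x**j in the Legendre polynomial P_k(x), as exact fractions."""
    c = [Fraction(0)] * (k + 1)
    for i in range(k // 2 + 1):
        num = (-1) ** i * factorial(2 * k - 2 * i)
        den = 2 ** k * factorial(i) * factorial(k - i) * factorial(k - 2 * i)
        c[k - 2 * i] = Fraction(num, den)
    return c


def legendre_approximant(moments):
    """Power-basis coefficients of the degree-n approximant on [-1, 1].

    moments[j] must hold E[X**j] for j = 0..n, with moments[0] == 1.
    """
    n = len(moments) - 1
    poly = [Fraction(0)] * (n + 1)
    for k in range(n + 1):
        c = legendre_coeffs(k)
        # P_k with each power x**j replaced by the j-th moment
        p_star = sum(c[j] * moments[j] for j in range(k + 1))
        lam = Fraction(2 * k + 1, 2) * p_star
        for j in range(k + 1):
            poly[j] += lam * c[j]
    return poly


# Moments E[X**j], j = 0..4, of the density (3/4)(1 - x**2) on [-1, 1]
mu = [Fraction(1), Fraction(0), Fraction(1, 5), Fraction(0), Fraction(3, 35)]
coefficients = legendre_approximant(mu)
print(coefficients)
```

Because the test density is itself a quadratic, the approximant should reproduce it once two or more moments are supplied, which makes it a convenient sanity check.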
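We now turn our attention to the more general case of a continuous random variable $Y$ which is defined on the closed interval $[a, b]$. We denote its density function by $f_Y(y)$ and its $k$th moment by
$$\mu_Y(k) = E(Y^k) = \int_a^b y^k f_Y(y) \, dy. \tag{7}$$
As pointed out in the Introduction, alternative methods are available for evaluating the moments of a distribution when the exact density is unknown. On mapping $Y$ onto $X$ by means of the linear transformation
$$X = \frac{2Y - (a+b)}{b-a}, \tag{8}$$
we obtain the desired range for $X$, that is, the interval $[-1, 1]$. The $j$th moment of $X$, which is obtained as the expected value of the binomial expansion of $\left(\frac{2Y - (a+b)}{b-a}\right)^j$, is given by
$$\mu_X(j) = \frac{1}{(b-a)^j} \sum_{k=0}^{j} \binom{j}{k} \, 2^k \, \mu_Y(k) \, (-(a+b))^{j-k}. \tag{9}$$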
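The moment conversion obtained from the binomial expansion can be checked numerically. The following Python sketch is an illustration only; the function name `x_moments_from_y` is hypothetical, and the uniform distribution on $[0, 3]$ is an assumed test case whose image on $[-1, 1]$ is again uniform.

```python
from fractions import Fraction
from math import comb


def x_moments_from_y(mu_y, a, b):
    """Moments of X = (2Y - (a + b)) / (b - a) from the moments of Y on [a, b].

    mu_y[k] must hold E[Y**k] for k = 0..n, with mu_y[0] == 1.
    """
    a, b = Fraction(a), Fraction(b)
    mu_x = []
    for j in range(len(mu_y)):
        # Expected value of the binomial expansion of (2Y - (a + b))**j
        total = sum(
            comb(j, k) * 2 ** k * mu_y[k] * (-(a + b)) ** (j - k)
            for k in range(j + 1)
        )
        mu_x.append(total / (b - a) ** j)
    return mu_x


# Y uniform on [0, 3] has E[Y**k] = 3**k / (k + 1)
mu_y = [Fraction(3 ** k, k + 1) for k in range(5)]
mu_x = x_moments_from_y(mu_y, 0, 3)
print(mu_x)
```

The odd moments of the transformed variable should vanish and the even ones should equal $1/(j+1)$, as expected for a uniform variable on $[-1, 1]$.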
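Equation (6) can then be used to provide an approximant to the density function of $X$. On transforming back to $Y$ with the affine change of variable specified in equation (8) and noting that $dx/dy = 2/(b-a)$, we obtain the following approximate density function for $Y$:
$$f_{Y_n}(y) = \frac{2}{b-a} \sum_{k=0}^{n} \lambda_k \, P_k\!\left(\frac{2y - (a+b)}{b-a}\right),$$
which can likewise be written in Mathematica notation. On combining equations (9) and (13), one obtains a compact representation of the density approximant, equation (14), in which the moments of $X$ are expressed directly in terms of those of $Y$.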
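Now, observing that $P_k(X)$ with $X^j$ replaced by $\mu_X(j)$ as given earlier is equivalent to $P_k\!\left(\frac{2Y - (a+b)}{b-a}\right)$ with $Y^j$ replaced by $\mu_Y(j)$, we also obtain
$$f_{Y_n}(y) = \sum_{k=0}^{n} \frac{2k+1}{b-a} \, P_k^*\!\left(\frac{2Y - (a+b)}{b-a}\right) P_k\!\left(\frac{2y - (a+b)}{b-a}\right), \tag{15}$$
where $P_k^*$ denotes the expanded polynomial in which each power $Y^j$ has been replaced by $\mu_Y(j)$.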
Thus, given $\mu_Y(1), \ldots, \mu_Y(n)$, the first $n$ moments of a random variable $Y$ defined on the interval $[a, b]$, an $n$th-degree polynomial approximation of its density function can be directly obtained from equations (14) or (15).
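As a rough end-to-end illustration, the Python sketch below builds the approximant on $[a, b]$ directly from the moments of the original variable by expanding each Legendre polynomial of the transformed argument in powers of that variable and substituting moments for powers. It is a sketch under stated assumptions rather than the article's Mathematica formula: the names `legendre_coeffs` and `density_approximant` are hypothetical, and the linear test density $2y/9$ on $[0, 3]$ was chosen here because a first-degree approximant must recover it exactly.

```python
from fractions import Fraction
from math import comb, factorial


def legendre_coeffs(k):
    """Coefficients c[j] of x**j in the Legendre polynomial P_k(x)."""
    c = [Fraction(0)] * (k + 1)
    for i in range(k // 2 + 1):
        num = (-1) ** i * factorial(2 * k - 2 * i)
        den = 2 ** k * factorial(i) * factorial(k - i) * factorial(k - 2 * i)
        c[k - 2 * i] = Fraction(num, den)
    return c


def density_approximant(mu_y, a, b):
    """Return a function y -> approximate density, from mu_y[k] = E[Y**k], k = 0..n."""
    a, b = Fraction(a), Fraction(b)
    n = len(mu_y) - 1
    scale, shift = 2 / (b - a), -(a + b) / (b - a)  # x = scale * y + shift
    weights = []
    for k in range(n + 1):
        c = legendre_coeffs(k)
        # E[P_k(scale * Y + shift)]: expand each power binomially, then
        # replace Y**i by its moment mu_y[i]
        expect = sum(
            c[j] * comb(j, i) * scale ** i * shift ** (j - i) * mu_y[i]
            for j in range(k + 1)
            for i in range(j + 1)
        )
        # (2 / (b - a)) * ((2k + 1) / 2) = (2k + 1) / (b - a)
        weights.append(Fraction(2 * k + 1) / (b - a) * expect)

    def f(y):
        x = scale * Fraction(y) + shift
        return sum(
            w * sum(cj * x ** j for j, cj in enumerate(legendre_coeffs(k)))
            for k, w in enumerate(weights)
        )

    return f


# The density 2y/9 on [0, 3] has E[Y**k] = 2 * 3**k / (k + 2)
mu_y = [Fraction(2 * 3 ** k, k + 2) for k in range(4)]
f_approx = density_approximant(mu_y, 0, 3)
print(f_approx(Fraction(3, 2)))
```

Recomputing the Legendre coefficients inside `f` keeps the sketch short; a practical implementation would cache them.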
It should be noted that the density approximants so obtained may be negative on certain subranges of the support where the density is low. This is likely to occur when too few moments are used. However, by mere inspection of the approximate density plot, we should be able to determine whether a higher degree polynomial ought to be used. Indeed, owing to the convergence of the approximant, its value at every point will converge to a nonnegative number as more moments are used. If we wish to obtain a bona fide density function, we could always take a normalized function, which is initially defined as being equal to the approximant except on subintervals where the latter is negative, wherein it is set equal to zero, and which is then rescaled so that it integrates to one.
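The truncate-and-normalize adjustment can be sketched as follows in Python. This is an illustration only: the name `clipped_normalized` is hypothetical, the normalizing constant is estimated with a trapezoidal rule on an evenly spaced grid (an assumption made here for simplicity), and the toy function used to exercise it is not a density approximant from the article.

```python
def clipped_normalized(f, a, b, num_points=2001):
    """Return a nonnegative function that integrates to approximately one on [a, b]."""
    step = (b - a) / (num_points - 1)
    grid = [a + i * step for i in range(num_points)]
    # Set the function to zero wherever it is negative
    clipped = [max(f(y), 0.0) for y in grid]
    # Trapezoidal estimate of the area under the clipped curve
    area = step * (sum(clipped) - 0.5 * (clipped[0] + clipped[-1]))

    def g(y):
        return max(f(y), 0.0) / area if a <= y <= b else 0.0

    return g


# Toy function that is negative on half of its range
g = clipped_normalized(lambda y: y, -1.0, 1.0)
print(g(-0.5), g(1.0))
```

A grid-based normalization is adequate for plotting; locating the roots of the polynomial and integrating exactly between them would be the more careful alternative.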
In the following application, a polynomial approximation is obtained for the density of $V$, the square of the distance between two points that are randomly distributed in the unit cube.
Example 1: Exact and Approximate Density Functions of V
Consider two points in the unit cube whose coordinates are all independently and uniformly distributed in the interval $(0, 1)$, and let $V$ denote the square of the distance between these two random points, that is, the sum of the squared differences of their three coordinates. The support of $V$ is the interval $[0, 3]$.
The closed-form representation of the density function of $V$ that follows, which is believed to be original, was derived from the integral representations obtained by [16, Section 2.6.4] by making use of certain trigonometric identities as well as some of Mathematica's algebraic simplification routines:
where the functions , , and can be easily identified from the expression given in the Appendix for , and denotes the indicator function which is equal to one whenever belongs to the set and zero otherwise.
The th moment of can be evaluated by integrating whenever . Alternatively, if the density function of were not known, we could determine its th moment from the following integral representation:
whose evaluation can be handled by Mathematica. However, on noting that the density function of is , it is much more efficient to compute the th moment of as follows:
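One way to obtain exact moments of the squared distance without the density is sketched below in Python. It rests on an assumption that is not confirmed by the text above: that the efficient route uses the triangular density $1 - |d|$ on $(-1, 1)$ of the difference of two independent uniform coordinates, so that $E(D^{2j}) = 1/((2j+1)(j+1))$, followed by a multinomial expansion over the three coordinates. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def even_moment_of_difference(j):
    """E[D**(2j)] for D = U - W with U, W independent uniform(0, 1).

    D has the triangular density 1 - |d| on (-1, 1), which gives
    E[D**(2j)] = 1 / ((2j + 1)(j + 1)).
    """
    return Fraction(1, (2 * j + 1) * (j + 1))


def moment_of_v(h):
    """E[V**h] for V = D1**2 + D2**2 + D3**2, by multinomial expansion."""
    total = Fraction(0)
    for i in range(h + 1):
        for j in range(h - i + 1):
            k = h - i - j
            multinomial = factorial(h) // (factorial(i) * factorial(j) * factorial(k))
            total += (
                multinomial
                * even_moment_of_difference(i)
                * even_moment_of_difference(j)
                * even_moment_of_difference(k)
            )
    return total


# First few moments of the squared distance, as exact rationals
moments = [moment_of_v(h) for h in range(4)]
print(moments)
```

The results are exact rationals, in line with the advice to carry out the calculations with rational numbers.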
Figure 1 shows the exact probability density function (PDF) of $V$ obtained from equation (16) (solid line) superimposed on its thirteenth-degree polynomial approximation (dashed line) evaluated from equation (15) (or equivalently equation (13)) with $a = 0$ and $b = 3$. The exact and approximate cumulative distribution functions (CDFs), which can easily be evaluated by integration, appear in Figure 2, and their difference is plotted in Figure 3. The code that was used for plotting the graphs and evaluating the various functions is provided in the Appendix. In general, note that in order to avoid round-off errors, it is advisable to carry out the calculations with rational numbers. In this example, the moments already are in rational form. When this is not the case, the command Rationalize can be used to obtain rational representations.
Figure 1. Exact and approximate (dashed line) PDFs. [Pq or Pq1 in the Appendix]
Figure 2. Exact and approximate (dashed line) CDFs. [PQ in the Appendix]
As Figure 3 indicates, the exact and approximate CDFs differ by less than 0.001 over the interval $[0, 3]$ in this case.
Figure 3. The difference between the exact and approximate CDFs. [Qd in the Appendix]
Example 2: Approximate Density of a Mixture of Beta Random Variables
Consider a mixture of two equally weighted beta distributions with parameters and , respectively. A fifteenth-degree polynomial approximation was obtained from the compact formula given in equation (14). The exact density function of this mixture and its approximant, both plotted in Figure 4, are manifestly in close agreement. A glance at the Mathematica code that is provided in the Appendix for this example should convince the reader that very little programming is indeed required. Clearly, methodologies that are based on only a few moments would fail to provide satisfactory approximations in this case.
Figure 4. Exact and approximate (dashed line) PDFs. [Pb in the Appendix]
As specified in Section 4, approximants that are expressed in terms of Jacobi polynomials are ideally suited for approximating beta-type density functions. However, in the absence of prior knowledge about the shape of a density function, it is advisable to use approximants based on Legendre polynomials, as they can theoretically accommodate any continuous distribution defined on a closed interval. It should be pointed out that if a density function turns out to be very irregular, a prohibitive number of moments might be required to approximate it satisfactorily. Thankfully, the majority of continuous distributions of interest are smooth and possess at most a few modes.
© Wolfram Media, Inc. All rights reserved.