Gradient of a Pure Function

Fabio Cavallini
Osservatorio Geofisico Sperimentale di Trieste
P.O. Box 2011, 34016 Trieste, Italy

The gradient is an operator that associates a vector field to a scalar field. Both scalar and vector fields may be naturally represented in Mathematica as pure functions. However, there is no built-in Mathematica function that computes the gradient. The standard package


has a function [Graphics:../Images/tricks_gr_158.gif] to compute gradients,


but [Graphics:../Images/tricks_gr_161.gif] operates on expressions like x+y+z rather than pure functions. Moreover, a maximum of three independent variables is allowed, and their names must be fixed a priori. For example:


This note presents and compares four ways to implement the gradient operator for pure functions. The author acknowledges helpful conversations with G. Seriani.

Roman Maeder's Programming in Mathematica [Maeder 1997, 116] contains an elegant implementation of the gradient:


This implementation allows for any number of Cartesian independent variables, but it too operates on expressions and not on pure functions. One can overcome these limitations by defining the operator


We use the symbol ▽ ( \[ EmptyDownTriangle ] ) and not ∇ ([Graphics:../Images/tricks_gr_166.gif]), which is an prefix operator. The operator [Graphics:../Images/tricks_gr_167.gif] computes the gradient of a pure function of an arbitrary number of independent variables. For example, given the pure function


or the alternative forms


the gradient at the generic point [Graphics:../Images/tricks_gr_171.gif] is


and analogously for [Graphics:../Images/tricks_gr_174.gif] and [Graphics:../Images/tricks_gr_175.gif].


The only limitation is that the functions to which [Graphics:../Images/tricks_gr_178.gif] is applied must be defined in terms of Cartesian coordinates. However, in dimension higher than three this is by far the most common situation and, in dimension two or three, similar routines may be written (using the appropriate mathematical formulae) to cope with the spherical, cylindrical, or any other coordinate system. The main problem with [Graphics:../Images/tricks_gr_179.gif] is that the derivatives are re-evaluated each time the gradient is applied. Hence, it is not efficient for repeated numerical computations.

For pure functions, shorter and more efficient code is possible:


However, [Graphics:../Images/tricks_gr_183.gif] does not work with the alternative forms [Graphics:../Images/tricks_gr_184.gif]or [Graphics:../Images/tricks_gr_185.gif].

A third solution has the flexibility of [Graphics:../Images/tricks_gr_186.gif] and the efficiency of [Graphics:../Images/tricks_gr_187.gif]; the only penalty for the improvement is that we must use an extra argument (the number n of independendent variables):


For example:


One might argue that the preceding solutions constitute, in some sense, a way of "cheating." Indeed, these algorithms temporarily create an expression from a pure function, then perform the derivatives, and turn the result into a function again, but do not handle the abstract differential operators as such. The real problem is the lack of an "operator algebra" in Mathematica; that is, rules for manipulating operators. A step in this direction is achieved with the package Notation.m.


We introduce the notation


which applies a function of operators to the same argument.


We may define the [Graphics:../Images/tricks_gr_203.gif]-dimensional gradient as


For example, the two-dimensional gradient


applied to [Graphics:../Images/tricks_gr_207.gif] and evaluated at [Graphics:../Images/tricks_gr_208.gif] yields


and similarly for [Graphics:../Images/tricks_gr_211.gif].

With this formalism, any differential operator is easily constructed. The [Graphics:../Images/tricks_gr_212.gif]-dimensional divergence


gives, for example,


Here is the [Graphics:../Images/tricks_gr_218.gif]-dimensional Laplacian:


The operator [Graphics:../Images/tricks_gr_222.gif] is applicable to functions defined in any of the three possible ways (as [Graphics:../Images/tricks_gr_223.gif], [Graphics:../Images/tricks_gr_224.gif], or [Graphics:../Images/tricks_gr_225.gif] above), but is not that fast or efficient. [Graphics:../Images/tricks_gr_226.gif] is the best solution when the given function is defined as [Graphics:../Images/tricks_gr_227.gif], but does not work in the other cases. Possibly the best compromise in terms of flexibility, efficiency, and conciseness is obtained by [Graphics:../Images/tricks_gr_228.gif]. Finally, [Graphics:../Images/tricks_gr_229.gif] has the same good qualities as [Graphics:../Images/tricks_gr_230.gif] and is even more elegant from the mathematical point of view.

Document converted by Mathematica of Wolfram Research