Example: LU Decomposition
Here is the code for actually doing an LU decomposition (factorization) from the package
Here are two test examples so that we can compare timings of different versions.
This computes the ratio of the timing of the package function to Mathematica's internal algorithm.
Ouch! On my computer, the machine case is over 1000 times slower and the bignum example is over 5 times slower. Before screaming an exasperated "Mathematica is just too slow", let us try to improve the code, keeping in mind packed arrays for the machine number case.
The first thing I see is that this code is using
The results are identical, but there is a clear speed improvement.
The next change will help make the code more readable. At this point, it will not seem to make much of a speed difference, but it does help. The idea is to replace the lines of code finding the correct pivot to use with a function.
This finds which of the possible pivots (in column c, rows c and above) is largest in absolute value, and swaps with the one in row c.
This, in turn, uses a simple function.
And putting it into the main function we get the following.
As before, results are identical.
Still going in the right direction, but no big savings. The place to look for big savings is, of course, the innermost loop. The next change replaces the innermost loop by giving Mathematica commands that operate on all the elements of a list.
A big speed improvement, but we no longer have identical results.
The reason the results are different is that the order of operations will be slightly different with the new code. Floating arithmetic is unfortunately subject to small differences in the order of operations. As long as the differences are small, they are not of significance. In this example, the differences are quite small.
With bignums, we also got a substantial speed improvement.
So replacing the innermost loop was spectacularly successful. There is still a set of nested loops which can be worked on though.
The results are the same as above with a decent speed improvement.
It is worth noting that even though the machine number case is still substantially slower than Mathematica's internal code, the bignum case is nearly as fast.
Now is the point at which we have to use Mathematica's compiler to get additional speed improvements.
The last argument to
This makes an LU factorization command which returns both LU and pivots.
It is important that the scoping of
The results are slightly different again, but the new command is definitely faster. Realistically, this is about as close as you will be able to get to Mathematica's internal code, which is highly optimized.
Converted by Mathematica May 1, 2000