Tuesday, August 26, 2008

F# For Scientists Misses the Boat On Mathematica Performance

I recently purchased F# For Scientists by Dr. Jon Harrop after the author mentioned it on the Mathematica Mailing List. According to Dr. Harrop,

Mathematica's .NET-Link technology allows Mathematica and .NET programs to interoperate seamlessly. Moreover, Microsoft's new functional programming language F# provides many familiar benefits to Mathematica programmers:
The marriage of Mathematica with F# can greatly improve productivity for a wide variety of tasks.

I am a big fan of Mathematica and functional programming and have been wanting to check out F# for some time so I decided to give the book a shot. It just arrived today so I can't post a full review but I did jump directly to the small section (5 pages) on using F# with Mathematica.

What did I learn? Well this section rightly claims that Mathematica has awesome symbolic math capabilities (it does). But then it goes on to claim that F# can beat the pants off of Mathematica on raw calculation. Thus it suggested F# programmers should call out to Mathematica for symbolic integration but then evaluate the result in F# for speed (to the tune of 3.4 times Mathematica's speed). I was naturally dubious. The explanation of this speed up is give as

The single most important reason for this speed boost is the specialization of the F# code compared to Mathematica's own general purpose term rewriter. ... Moreover, the F# programming language also excels at compiler writing and the JIT-compilation capabilities of the .NET platform make it ideally suited to the construction of custom evaluators that are compiled down to native code before being executed. This approach is typically orders of magnitude faster than evaluation in a standalone generic term rewriting system like Mathematica.


Okay, hold the phone! First off, I did not know the F# language could write compilers. I'll forgive this as poetic use of language. I guess I sort of know what he meant to say. More interesting is that we have gone from 3.7 times to "orders of magnitude". Now, I don't take anything away from the brilliant folks at Microsoft, but the equally brilliant folks at Wolfram have been focusing exclusively on mathematics software for 20 years and you might think they learned a thing or two about computational speed!

Here is the example from the book...

First, he uses Mathematica to integrate a function.


Integrate[Sqrt[Tan[x]],x]

(-2*ArcTan[1 - Sqrt[2]*Sqrt[Tan[x]]] +
2*ArcTan[1 + Sqrt[2]*Sqrt[Tan[x]]] +
Log[-1 + Sqrt[2]*Sqrt[Tan[x]] - Tan[x]] -
Log[1 + Sqrt[2]*Sqrt[Tan[x]] + Tan[x]])/(2*Sqrt[2])


He then goes to show that Mathematica takes 26 secondsto evaluate this function in loop for 360,000 iterations.

He then shows a translator that converts the Mathematica to F# and the F# code does the same work in 7.595 seconds.

So far Dr. Harrop is correct but like some many others who are in a rush to show their new favorite language superior to another's, he forgets to read the manual! Particularly, the section on optimization! If he had he would have found a handy little Mathematica function called Compile. Hmm, sounds promising. And in fact....


cf = Compile[{{x, _Complex}}, Evaluate[Integrate[Sqrt[Tan[x]],x]]]
Timing[Do[cf[x + y I],{x,-3.0,3.0,0.01},{y,-3.0,3.0,0.01}]]

{5.281,Null}

That's 5.281 seconds on my relatively underpowered laptop (Thinkpad X60) !

Some might feel I'm being a bit harsh on Dr. Harrop but after all he made me layout bucks for a book that promised me "many familiar benefits" only to deliver 5 measly pages of half truth. F# programmers may benefit from Mathematica but the jury is still out as to whether the reverse is true.

17 comments:

Jon Harrop said...

The F# code that you refer to in my book illustrates the basic process of interoperating with Mathematica to get the symbolic result of an integral that may then be manipulated in F#. The example given is evaluation of the symbolic expression. The power of this approach stems from Mathematica's concept of "everything is an expression", allowing the same process to be used to handle graphics and even sound.

If you want the best possible performance for expression evaluation then you need to get the expression compiled to native code. This is much easier with F# than with Mathematica because F# code is compiled to native code. So you just write the symbolic result as F# source code (or write a program to automate the translation, as described in The F#.NET Journal article). This reduces the time taken to only 1.5s with the latest F# but the process is easily parallelized in F# using Microsoft's Task Parallel Library, reducing the time taken to only 0.2s on this eight core machine, which is still over an order of magnitude faster than your optimized Mathematica and there is still plenty of room for optimizing the F# further.

Many other tasks show even greater performance improvements in F#. Specifically, tasks that cannot be handled by Mathematica's Compile and tasks that leverage parallel data structures where the high-performance concurrent garbage collector in .NET can be much faster than Mathematica's single-threaded reference counting.

I hope this clears up the questions you had about my book. I would love to discuss this further but only in an unmoderated forum, such as the comp.lang.functional news group.

Sal Mangano said...

First off, and this is just my opinion as a fellow author, if you are going to pitch your book to folks in a group you might be a bit more transparent about how much of the book deals with Mathematica per se. Not that Mathematica people are not interested in other things but you did not pitch it that way.

Second, once you compile something in Mathematica you are bypassing all the generic layers that are there because Mathematica must deal with symbolic, matrices and the like. So 99% of calculation in the example you gave is running in highly optimized machine code in the Mathematica Kernel. Do you really think Mathematica implements Tan[] in an interpreted fashion?

Third, Mathematica developers can take advantage of the Parallel Computing ToolKit so there is equal access to multi-core capabilities.

Fourth, I am sure that there are tasks that .NET can do better than Mathematica. Maybe you should have picked a better example that Scientists might appreciate. Leading scientists to believe they should abandon Mathematica to compute the results of an integral is not very helpful. If this was intended to be a "toy example" then you should have presented it as such in the book.

I hope this clears up what I found lacking about this section of your book. When I finish it I'll post a more thorough review.

Jon Harrop said...

Apologies if my post misled you but I believe many Mathematica users will find not only the Mathematica-specific content interesting but also the fact that Microsoft are starting to push functional programming languages related to Mathematica into the mainstream for general-purpose programming.

You have asserted that Mathematica's compilation "bypasses all generic layers" and results in "highly-optimized machine code" being executed but Mathematica is not a great native-code compiler as you imply. In fact, Mathematica's performance is typically worse than interpreted languages and far worse than native-code compiled languages like F#, C and Fortran. For example, tabulating a million values of Tan takes 0.4s in Mathematica, 0.33s in interpreted OCaml bytecode and 0.05s in F# because the F# is compiled to native code. This gap widens for more complicated programs. If you want an example of a program that is much slower in Mathematica, try porting this benchmark. The performance gap is closer to two orders of magnitude in that case and the program is only a few dozen lines long.

You also claim that Wolfram Research's Parallel Computing Toolkit provides "equal access to multi-core capabilities" but that toolkit imposes message passing which incurs the copying of all data passed between parallel threads, even on a shared memory machine where that copying is completely unnecessary. Consequently, the overheads for using this toolkit are orders of magnitude higher than the with the Task Parallel Library.

Sal Mangano said...

Ah, the hours wasted by programmers quibbling over whose language is "better, faster, stronger". What a waste.

The real reason for my post was, as I've said, was that you throw F# out there as an appropriate tool for Mathematica users but then your only example is trivial and misleading. One of the reasons it is trivial is that in general, Integrate may return functions that F# just does not have in its library. Does F# have Hypergeometric2F1, FresnelS, Erf, PolyLog, etc. All can be returned by pretty tame integrations. And it is misleading because it does not show how you can do the best possible job in Mathematica itself.

BUT, the real issue here, for me, stems from the title of your book "F# for Scientists". If it was "F# for Programmers" we would not be discussing this. You are selling F# as a productivity tool for Scientists and by implication, scientists who use Mathematica. I'll take you up on that one in my more thorough review of your book.


p.s. If you already know Mathematica is 2 orders of magnitude slower on ray tracing (not my first choice for somthing the avg scientists does in Mathematica) why don't you email me the Mathematica code you wrote to demonstrate that and save me the trouble.

Jon Harrop said...

On the contrary, learning how to solve problems orders of magnitude more quickly is extremely important to computational scientists and the best solution is undoubtedly to leverage a variety of different tools (of which Mathematica is one).

I have uploaded a direct translation of the first (least optimized) ray tracer benchmark implementation to Mathematica 6 here.

I am amazed that you do not know how relevant ray tracing is to scientific computing. If you are interested in learning more about this, I suggest comparing the hierarchical techniques used to accelerate ray tracing (such as the hierarchical spherical bounding volumes used in this benchmark) with the Fast Multipole Method used in cosmology and molecular dynamics. The basics of the latter are described in section 3.10.2 of F# for Scientists and, as you can see, the code for accurately approximating electrostatic forces between charged particles is almost identical to the Intersect routine of this ray tracer. This is one of many essential techniques from computational science that is vastly more efficient in F# than Mathematica.

Sal Mangano said...

I did not say ray tracing was not important just that it was not important to the average scientist. It may not even be important to the average physicist or seismologist. Linear algebra is certainly important to a large number of scientists so perhaps inverting a large matrix (1000 a 1000) would be an interesting comparison.

Thanks for the ray tracing sample. I'll fool around with it to see how its performance can be improved in Mathematica without changing the basic algorithm.

Jon Harrop said...

Ray tracing should be important to the average computational scientist not only because it is directly related to optics (physics) but because there is such a large overlap between the algorithms and data structures used in ray tracing and those used in many areas of scientific computing. Look at k-D trees, for example.

Your example of inverting large matrices should actually be much less interesting to scientists because most of their matrix inversion problems are best rephrased in terms of the more efficient and more numerically-robust LU decomposition. A single matrix inversion or LU decomposition is also uninteresting as a benchmark because it simply entails calling a single low-level function from a linear algebra library such as LAPACK. This would show Mathematica in an artifically-good light in terms of performance because real programs rarely just call one external function.

There are also a wealth of data structure intensive problems to be solved in scientific computing where Fortran is too inexpressive and Mathematica is too slow. So high-level languages that also provide high-performance, like F#, are well placed to open new avenues of research for scientists. This was actually the case for my own research into the structural properties of amorphous materials where exotic data structures facilitate forms of analysis that had been impossible before. My programs that took days to run in Mathematica now take seconds to run in OCaml and F#.

Sal Mangano said...

Thank you. Your response takes us full circle to what prompted my post in the first place. You claim have some experience in the types of problems that scientists are concerned with and that give Mathematica trouble. But rather than drawing on this experience to present some compelling examples you pick an Integration problem.

The problem with you book is that it fails to live up to its title. You spend the first 12 chapters mainly on F#. There is nothing wrong with that, after all you are introducing a language that presumably the reader has little experience with. But, if the book is called F# for Scientists, then the reader is probably expecting some meaty scientific examples. Instead you offer, Fast Fourier Transforms, Eigen values, Nth nearest neighbor graph problem, and logistic map. These problems are of some interest but they are all easily done in Mathematica or Matlab and I think most Mathematica/Matlab users would prefer working in these environments.

The way I see it, you drag the poor scientist through 280 pages of F# instruction with very little payoff at the end. If a scientist is already in bed with Mathematica, Matlab, R, C++ or even Fortran, I see little reason for him to jump ship to F# based on your presentation. Further, the bar is higher, because he or she would now be effectively tied to a single platform.

On the other hand, I might recommend you book to someone who was predisposed to learning F# but why did you call it "Programming F#" or some such title that more honestly reflects its content?

Anonymous said...

Honestly, I think this discussion is pointless.
Performance, performance, performance.
If all you care about is performance, just go with raw C or C++ code using Intel's highly processor-optimized Math Kernel Library. That will put to shame both Mathematica and F# and you know its true.
The reasons to pick either F# or Mathematica go beyond performance: simplicity, API completeness, representational power, programming clarity, etc.
And quite frankly I would say that for +90% of the guys out there it's a matter of taste, not reason.

Sal Mangano said...

Steve,

Mathematica is built on Intel's Math Kernel Library but of course well written C will be faster for other reasons.

Of course, you are right about the other reasons for choosing a language. 95% of Mathematica's users don't choose it for performance reasons anyway.

CSMR said...

Hey I think it's nice you got detailed response from the author. It's an old post and I don't know if you've read the whole book (I haven't yet myself) but I think you started at the wrong place if you wanted to know the advantages of F# over Mathematica for scientific computing.

The first chapter gives the main benefits of F# which have mostly to do with the static type system. I love much of Mathematica but am moving to F# because I have to spend most of my time finding bugs that that would have been automatically flagged in a statically typed language.

That's the key advantage, and then I want to know am I going to miss Mathematica's library of functions and visualization, and it's good to know there is some interoperability.

Regarding speed, in mathematica the "Compile" function types the variables but with only a handful of possible types. Which is a big restriction on the programs you can use it on.

I've had rather limited success with the Parralelize function; it's not well documented and I don't know what form expressions have to be in for Mathematica to know they are side-effect free. Again it seems like there are severe restrictions on program structure to be able to use this feature. I don't know what happens in F# though since afaik functions aren't known by the compiler to be pure or impure.

Jon Harrop said...

@CMSR

Another major problem with Mathematica's Compile is that it does not work on recursive functions.

Regarding parallelism, this is a problem for Mathematica because the entire language implementation is based around a giant mutable set of rewrite rules. The only feasible solution is to provide separate rewrite tables for separate threads of computation but copying an imperative rewrite table is slow and using purely functional data structures makes using it slow (and would require the addition of real garbage collection to Mathematica). However, Mathematica's comparatively awful performance renders parallelism rather academic anyway: even if Mathematica could be parallelized with zero overheads, you would still need hundreds of thousands of cores before it could compete with the performance of a modern compiled language running on a single core.

For example, here is a simple computation from optics where Mathematica is ~100,000x slower than languages like F#. Even the code optimized by Wolfram Research is still over 1,000x slower...

Sal Mangano said...

The 100,000x slower claim by Flying Frog is false and he knows it. It is a product of very poor Mathematica skills. It is a shame that he continues to perpetuate this even though it was demonstrated to be false via a bet made by Xah Lee

http://www.javakb.com/Uwe/Forum.aspx/java-programmer/45436/Mathematica-7-compares-to-other-languages.

Of course it is true Mathematica is slower than a compiled language but one doest not choose Mathematica over a compiled language based on its speed.

Anonymous said...

As I said to Xah at the time:

"You have only observed a speedup because you have drastically simplified the
scene being rendered. Specifically, the scene I gave contained over 80,000
spheres but you are benchmarking with only 5 spheres and half of the image
is blank!

Using nine levels of spheres as I requested originally, your version is not
measurably faster at all."

So my verifiable claim that Mathematica is 100,000× slower than compiled languages was entirely correct.

Sal Mangano said...

Okay, as so as my book is out the door, I'll rewrite your code to show that the 100000x number is ridiculous.

Anonymous said...

FWIW, Wolfram Research themselves already took up my challenge with Daniel Lichtblau producing the fastest Mathematica implementation but it was still 1,800× slower than my OCaml.

Suffice to say, these huge performance discrepancies leave plenty of room for more performant tools like F#.

Sal Mangano said...

See, you know the number is not 100000x but you insist on repeating it. This show you have ZERO credibility. I think 1800 would have been good enough for you to make your point.

I like challenges so I'll see if I can knock a few off of the 1800x but as I said in the past, you don't use Mathematica for ray tracing. You do use it for almost everything else.

For example, at IMUC 2009 I watched a keynote by Chris Carlson where he brought an amazing architectural project to life in just a few dozen lines of Mathematica. F# could not have pulled that off in a 100000 lines. So F# is 100000x less productive. Hey, if you can lie, why can't I.