29

I like F# ; I really, really do. Having been bitten by the "functional programming"-bug, I force myself to use it when I have the opportunity to. In fact, I recently used it (during a one week vacation) to code a nice AI algorithm.

However, my attempts so far (see a SO question related to my first attempt here) seem to indicate that, though undoubtedly beautiful... F# has the slowest execution speed of all the languages I've used.

Am I doing something wrong in my code?

I verbosely explain what I did in my blog post, and in my experiments, I see OCaml and the rest of the group running anywhere from 5x to 35x faster than F#.

Am I the only one with such experiences? I find it disheartening that the language I like the most, is also the slowest one - sometimes by far...

EDIT: Direct GitHub link, where the code lives in various language forms...

EDIT2: Thanks to Thomas and Daniel, speed improved considerably:

  • Greatest speed boost: moving from "ref" to "mutable" gave a whopping 30%.
  • Removing exceptions and using while/flagChecks gave another 16%.
  • Switching from discriminated unions to enums gave another 5%.
  • "inline" gave 0.5-1%

EDIT3: Dr Jon Harrop joined the fight: 60% speedup, by making ScoreBoard operate directly on the "enumerated" version of the data. The imperative version of F# now runs 3-4 times slower than C++, which is a good result for a VM-based runtime. I consider the problem solved - thanks guys!

EDIT4: After merging all optimizations, these are the results (F# reached C# in imperative style - now if only I could do something about functional style, too!)

  • real 0m0.221s: That was C++
  • real 0m0.676s: That was C# (imperative, C++ mirror)
  • real 0m0.704s: That was F# (imperative, C++ mirror)
  • real 0m0.753s: That was OCaml (imperative, C++ mirror)
  • real 0m0.989s: That was OCaml (functional)
  • real 0m1.064s: That was Java (imperative)
  • real 0m1.955s: That was F# (functional)
Community
  • 1
  • 1
ttsiodras
  • 10,602
  • 6
  • 55
  • 71
  • I think your question is a good one but needs a more positive question-like title. Maybe something like "F# seems slower than other languages. How can I speed it up?" – Zan Lynx Jul 22 '11 at 18:03
  • I agree with @Zan. Your question seems to be a rant in its current state. – ChaosPandion Jul 22 '11 at 18:06
  • 6
    I think that the question still feels more like a rant. You reference a blog post, but it's unlikely that someone would read a long post in order to answer your question. Could you refine the question and find one particular piece of code that is slower in F# and that is short enough to fit into a question? – Tomas Petricek Jul 22 '11 at 18:19
  • I fear that that's the point - it's not microbenchmarks that triggered my post, it's the fact that once you get down to it and write a complete algorithm, you find out that it runs a lot slower than what you'd expect from a compiled language. I've seen it twice so far, and the first time, Jon Harrop found a solution - I'm sort of hoping for a cameo... – ttsiodras Jul 22 '11 at 18:29
  • In case it helps, I added a link to the GitHub repos where both the functional and imperative forms of the F# implementation can be seen. – ttsiodras Jul 22 '11 at 18:56
  • 1
    @ttsiodras - Well, how did Jon Harrop find the solution in the first case? I assume he bechmarked the program and found the bottle-neck. Then he replaced the slow part with something faster. To ask a question, you need either a reasonably sized code sample (that someone can benchmark) or know what the bottle-neck is (so that someone can advise on more efficient approach). – Tomas Petricek Jul 22 '11 at 18:59
  • The bottleneck is the "ScoreBoard" function (85% of the execution time), which I first implemented in functional style, and got an execution time of ...a minute. I then rewrote it in imperative form, which is shared in the two F# implementations (abminimax in functional/imperative) - but even when going all imperative, my F# code runs at half the speed of C# ... And you can benchmark - just checkout from GitHub and run "make benchmark" – ttsiodras Jul 22 '11 at 19:05
  • from taking a quick look at your code, it seems that you are using the List module everywhere, doing multiple iterations of it, mapping, etc. i think you would find some improvements if you tried to use the Seq instead, also, for readability, you should probably break up your functions into smaller chunks – Alex Jul 22 '11 at 19:11
  • 1
    @ttsiodras: Your C++ code uses the `board` array in-place whereas your F# implementation copies it unnecessarily in the inner loop. You are also using arrays like `[|(0,0); (1,1); (2,2); (3,3)|]` that are obviously pointless. Fixing these obvious discrepancies immediately makes the F# almost 2× faster than before. – J D Jul 23 '11 at 11:24
  • @Jon: (1) What do you mean "copies it in the inner loop"? Just in case you reviewed the wrong code, the comparison must be made with "F#/score4.fs", not "F#/score4_functional.fs" - where indeed there exists list processing (though I am not sure that implies copying - and even if it does, the same code (line by line!) runs 4x-5x under OCaml...) (2) The arrays you are referring to exist under C++ too, so I don't understand what you mean... Can you send me the code after your fixups, so I can see the differences you refer to? – ttsiodras Jul 24 '11 at 01:57
  • @Tomas: I am sad you decided to close this question - I don't understand why you consider it "not real", since it includes code and reproducible benchmarks, and it has already triggered responses with real speed benefits... If nothing else, 5 people so far have considered it a favorite question... – ttsiodras Jul 24 '11 at 02:08
  • @ttsiodras: I agree that your question has merit, but it probably fits better at http://codereview.stackexchange.com since the "answer" would consist of iterative and systematic improvements to your code, not a single, definitive response. – Daniel Jul 24 '11 at 03:34
  • @ttsiodras: Your `scoreBoard` function accesses `board` directly in your C++ but copies it into a new heap-allocated `scores` array in your F# and OCaml, requiring 1,300,000 unnecessary allocations totalling around 100Mb of space! You even commented on this difference between the programs yourself in your C++ code. – J D Jul 24 '11 at 13:39
  • Temporary blindness - indeed, 60% speedup! Commited to github, thanks a lot! – ttsiodras Jul 24 '11 at 19:01

2 Answers2

17

Unless you can give a reasonably sized code sample, it's difficult to tell. Anyway, the imperative F# version should be as efficient as the imperative C# version. I think one approach is to benchmark the two to see what is causing the difference (then someone can help with making that bit faster).

I briefly looked at your code and here are some assorted (untested) suggestions.

  • You can replace discriminated union Cell with an enum (this means you'll use value types and integer comparison instead of reference types and runtime type tests):

    type Cell =    
      | Orange = 1
      | Yellow = 2
      | Barren = 3
    
  • You can mark some trivial functions as inline. For example:

    let inline myincr (arr:int array) idx =
      arr.[idx] <- arr.[idx] + 1
    
  • Don't use exceptions for control-flow. This is often done in OCaml, but .NET exceptions are slow and should be only used for exceptions. You can replace the for loop in your sample with a while loop and a mutable flag or with a tail-recursive function (a tail-recursive function is compiled into a loop, so it will be efficient, even in imperative solution).

Tomas Petricek
  • 240,744
  • 19
  • 378
  • 553
  • Tried the discriminated to enum: it gave a 5% speedup. Unexpected, thanks! Inlining didn't seem to help in any place I tried it... Will try removing the exception and report back. – ttsiodras Jul 22 '11 at 19:29
  • Removing exceptions improved speed by another 16%. Imperative F# is now 60% slower than C#. Anything else I can try? – ttsiodras Jul 22 '11 at 19:45
  • Following a suggestion from @Daniel thru chat, I moved from using "ref" to using "mutable" - and got another 10%. I am feeling positively hopeful - anything else, guys? – ttsiodras Jul 22 '11 at 19:55
  • I was wrong - the speedup was in fact 30% when I moved from "ref" to "mutable" (already commited on GitHub). Wow, imperative F# is now at only 25% distance from C#... I consider it good enough. – ttsiodras Jul 22 '11 at 20:04
12

This isn't an answer, per se, but have you tried writing the exact same code in F# and C#, i.e., imperative F# code? The speed should be similar. If you're comparing terse functional code with heavy use of higher-order functions, sequence expressions, lazy values, complex pattern matching, etc.--all things that allow for shorter, clearer (read, more maintainable) code--well, there is frequently a trade-off. Generally, development/maintenance time is much greater than execution time, so it's usually considered a desirable trade-off.

Some references:
F# and C# 's CLR is same then why is F# faster than C#
C# / F# Performance comparison
https://stackoverflow.com/questions/142985/is-a-program-f-any-more-efficient-execution-wise-than-c

Another point to consider: in a functional language you're working at a higher level and it becomes very easy to overlook the costs of operations. For example, Seq.sort seems innocent enough, but naive use of it can doom performance. I'd recommend poring over your code, asking yourself along the way if you understand the cost of each operation. If you're not feeling reflective, a faster way to do this is, of course, with a profiler.

Community
  • 1
  • 1
Daniel
  • 47,404
  • 11
  • 101
  • 179
  • Actually, I did - if you go to my "Moving to imperative style" section in the blog post, you'll see that once I coded in C# (and got a 2x over F#) I re-wrote in imperative style for both OCaml and F#. F# remained well behind C# (6 seconds vs 3 seconds, a 2x). Not to mention that OCaml had 1.4 and C++ had 0.2 seconds... – ttsiodras Jul 22 '11 at 18:40
  • I'd like to see your imperative F# code. There's no reason it should perform much different than C#. – Daniel Jul 22 '11 at 18:44
  • By all means - as said in the blog post, the code lives in GitHub: https://github.com/ttsiodras/Score4/tree/master/ - navigate in the F# folder, it contains both the imperative and functional solutions. – ttsiodras Jul 22 '11 at 18:48
  • 2
    Sorry, I guess I'm lazy, I'm not reading a _long_ blog post. ;-) Thanks for the link. – Daniel Jul 22 '11 at 18:48
  • OK, the first red flag I see is `printf`. Get rid of it. It uses reflection and is S-L-O-W. – Daniel Jul 22 '11 at 18:51
  • It is only called 7 times... it can't explain the difference in execution time. – ttsiodras Jul 22 '11 at 18:53
  • @ttsiodras let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/1747/discussion-between-daniel-and-ttsiodras) – Daniel Jul 22 '11 at 18:56
  • Your suggestion for moving from "ref" to "mutable" turned out to be the best - thanks! – ttsiodras Jul 22 '11 at 20:08
  • 4
    @ttsiodras: Cool. I'd still recommend profiling until it's on par with C#. There must be a reason for the difference. [Here](http://eqatec.com/Profiler/Download.aspx) is a free one. – Daniel Jul 22 '11 at 20:10
  • In case you care about the final results: if you checkout from GitHub, after many optimizations, I now get: (real 0m0.221s: That was C++) (real 0m0.676s: That was C# imperative) (real 0m0.704s: That was F# imperative) (real 0m0.753s: That was OCaml imperative) (real 0m0.989s: That was OCaml functional) (real 0m1.064s: That was Java imperative) (real 0m1.955s: That was F# functional)... So F# imperative reached C# imperative. Wish I could do something about the functional version, too :-) – ttsiodras Jul 29 '11 at 12:02