
Functional programming offers immutable data structures and no side effects, which are inherently suitable for parallel programming. I am investigating how to exploit multicore computation in a functional language, targeting production code for some numerical applications.

F# has Microsoft behind it, and its parallel constructs such as PLINQ, the TPL, and async workflows are well documented and have shown potential. However, research on parallelism in Haskell is very active at the moment, and Haskell possesses many nice features that F# does not yet support.

My question is which language I should choose for functional parallelism? If F# is chosen, are there any pointers to build up what they currently have in Haskell?

UPDATE:

I chose Simon's answer because it brought out some nice discussion about garbage collection, memory allocation, and cache misses. I will stick with F#, and I think these answers will help me study functional parallelism.

pad
  • GHC has Microsoft behind it as well: most of the principals work for Microsoft Research Cambridge, and working on GHC is their job. – geekosaur Mar 30 '11 at 22:00
  • And the new parallel & deterministic library: http://www.reddit.com/r/haskell/comments/gainp/a_monad_for_deterministic_parallelism_icfp2011/ – Chris Kuklewicz Mar 30 '11 at 22:03
  • F# isn't pure; for example, even though Haskell's STM has been ported to F# (http://cs.hubfs.net/blogs/hell_is_other_languages/archive/2008/01/16/4565.aspx), the compiler cannot prevent side-effecting code inside STM. – Ed'ka Mar 30 '11 at 22:19
  • @geekosaur: Just to clarify, GHC has Microsoft Research behind it, while F# has Microsoft Research as well as a Microsoft product team behind it. This is quite a difference: a product team means there is a commitment to long-term support for F# as part of (at least) Visual Studio 2010. That's not quite the same thing as the MSR support for Haskell (although the researchers working on Haskell are definitely doing a great job). – Tomas Petricek Mar 31 '11 at 02:05
  • It's hard to give a simple answer to such a vague question. What exactly are you looking to parallelize? – Dan Burton Mar 31 '11 at 03:11
  • @Dan: I'm working on numerical algorithms which employ a lot of symbolic representation and require parallelizing operations on tree data structures. – pad Mar 31 '11 at 11:36
  • You should really include Erlang in your survey as well. It is a very clean combination of functional + parallel. For an explanation of how it is functional, see http://stackoverflow.com/questions/2271417/is-erlang-really-a-functional-language/2271574#2271574 – Peer Stritzinger Mar 31 '11 at 15:54
  • This question does not really have any objective answer. It is the same as asking "What is the best language?" or "Is language A better than language B?". Different people will have different answers. Choose whatever language that fits you. Voting to close. – MAK Mar 31 '11 at 17:45
  • The very last question "If F# is chosen, are there any pointers to build up what they currently have in Haskell?" sounds valid to me. Personally, I don't think this should be closed. – Tim Perry Mar 31 '11 at 22:38
  • @tim-perry: That's what I want to ask too. I wonder how to bring such nice parallelism constructs of Haskell into F#. – pad Apr 01 '11 at 11:02
  • @pad: If you're serious about multicore parallelism then you need to forget about Haskell and study how real parallel programs are written. I explained this in detail in my answer but it got downvoted into oblivion in under 9 hours! – J D Apr 09 '11 at 09:00
  • @Jon: You seem to be against functional parallelism. In your view, is there any advantage of the functional paradigm over the imperative paradigm in the context of parallel programming? – pad Apr 09 '11 at 20:32
  • @pad: On the contrary, I am all for researching functional parallelism properly (i.e. seeking peer review from experts in *parallel* programming and not just functional programming). My point is that the current understanding of this subject is full of holes and the implementation is virtually untested, so there's no way I'd build upon these ideas in production code. – J D Apr 12 '11 at 23:35

6 Answers


If the kind of code you have in mind allocates memory heavily, then you might find that the GHC garbage collector scales better than the .NET garbage collector. There's some anecdotal evidence that the .NET GC becomes a bottleneck when multiple threads are allocating heavily, and this is a thorn in the side of most Java collectors too. On the other hand, we've paid quite a lot of attention to achieving good locality and scalability in the GHC garbage collector, mainly because we have no choice: most idiomatic Haskell code allocates heavily anyway. I have benchmarks that allocate like crazy and keep scaling beyond 24 cores.
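To make the allocation-heavy parallel style concrete, here is a minimal sketch (my own illustration, not Simon's benchmark) using `parMap` from the standard `parallel` package; each element sparks a computation that allocates a fresh intermediate list:

```haskell
import Control.Parallel.Strategies (parMap, rdeepseq)

-- Deliberately allocation-heavy: every call builds (and discards)
-- an intermediate list, as idiomatic Haskell code often does.
sumTo :: Int -> Int
sumTo n = sum [1 .. n]

-- Spark one fully forced evaluation of sumTo per list element.
parSums :: [Int] -> [Int]
parSums = parMap rdeepseq sumTo

main :: IO ()
main = print (sum (parSums [100000, 200000, 400000]))
```

Compiled with `ghc -O2 -threaded` and run with `+RTS -N`, the sparks are spread over all available cores; GHC's per-core nursery allocation is what keeps an allocation rate like this from serializing on the collector.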

Note that in Haskell you get a guarantee of determinism from the type system, which you don't get in F#.

You mentioned Data Parallel Haskell. A cautionary note here: it isn't ready for production use at the present time, although the DPH team expect the forthcoming GHC 7.2.1 release to have a stable DPH implementation.

Simon Marlow
  • Thank you for the caution about DPH. It looks really impressive, though, and would save developers a lot of effort. – pad Mar 31 '11 at 11:40
  • You've pointed out a very important aspect of parallel programming. I see a similar problem with the .NET GC here: http://stackoverflow.com/questions/2311154/how-can-i-improve-garbage-collector-performance-of-net-4-0-in-highly-concurrent/2311171#2311171. I think garbage collection is critical in the context of functional parallelism, where memory allocation is scattered over the program. From a developer's perspective, what are your suggestions to avoid such problems? – pad Apr 04 '11 at 11:24
  • @pad: The solution is to avoid allocating by mutating in-place. Writing scalable parallel programs for multicores in a purely functional style is an unsolved problem. Nobody has even studied the cache complexity of purely functional data structures and algorithms yet... – J D Apr 08 '11 at 22:46
  • @Jon: Mutating in-place alone doesn't solve the problem. Since you mentioned cache-oblivious data structures: the situation is pretty clear for arrays and matrices, but how are you going to organize recursive data structures (trees) in memory to reduce cache misses? – pad Apr 09 '11 at 20:25
  • @pad: I'll defer to the significant body of work on cache-oblivious trees. The state of the art is to adopt the van Emde Boas layout, but there are other kinds of cache-oblivious trees and other tree-based cache-oblivious data structures like the funnel heap. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.89.2955&rep=rep1&type=pdf http://www.mpi-inf.mpg.de/departments/d1/teaching/ws10/models_of_computation/FunnelHeapTalk.pdf http://supertech.csail.mit.edu/cacheObliviousBTree.html – J D Apr 10 '11 at 20:07
  • @Jon: avoiding allocation is not a "solution". If your language implementation forces you to avoid allocation in order to get parallel speedup, that is a serious limitation. Maybe you should consider a different language :-) – Simon Marlow Apr 11 '11 at 15:23
  • @SimonMarlow: Degrading absolute performance is not a "solution". If your language implementation forces you to sacrifice performance in order to get parallel speedup, that is a serious limitation. – J D Apr 12 '11 at 23:19
  • Indeed, it's pretty trivial to get good parallel speedup if you sacrifice enough performance that your serial version is sufficiently slow. – Jonathan Dursi May 04 '11 at 02:23

First of all, I agree with others that there is no objective answer.

However, I think that the idea of functional parallelism is a bit overrated. Surely, you can easily find data dependencies in your program and if you're processing lots of data, you can use some data-parallel library to easily and safely parallelize it. However, this can be done even in C# (using TPL and PLINQ) if you're a bit careful about what you're writing.

The problem is that most programs don't need to be parallelized, because they simply don't do enough CPU-intensive work. For example, F# async solves (I think) the more important problem of enabling asynchronous I/O, which is the cause of most "hangs" in connected applications. I think the popularity of Node.js demonstrates this importance quite nicely.

The real value of functional languages is in the expressivity of the language - you can easily define abstractions for your problem, write code in a more succinct way that is easier to understand, reason about and test. You get this in both F# and Haskell.

To answer your specific question about parallelism: I believe that the status of parallelism support in F# is more stable (but then, I'm an F# person). You can choose between async, the TPL, and (Erlang-inspired) F# agents, which are all quite stable libraries. On the Haskell side, there is still a lot of evolution going on; the most recent work is just a few weeks old. I also find it easier to use parallelism in a language with a clearly specified evaluation model, but that may be just my personal preference.

Tomas Petricek
  • As a complete newbie using your F# book (which is excellent so far, BTW), I would add that the elegance of the solutions in functional languages (F#, Haskell) is incomparable to that of the imperative ones. – Marcote Mar 31 '11 at 02:09
  • F# doesn't have STM though, right? That is a major load off the brain when writing certain kinds of concurrent code. – luqui Mar 31 '11 at 02:54
  • I think you're understating the value here. In a functional language it's much more likely that you can start with an idiomatically written program and parallelise it without mangling the code too much, or equivalently that you can write a parallel program that doesn't look too different from the idiomatic sequential one. Furthermore, you're much more likely to get it right (fewer hazards to trip you up). It's not about absolute performance: it's about the tradeoff between performance, effort, maintainability, and robustness. – Simon Marlow Mar 31 '11 at 07:11
  • Yes, F# doesn't have STM. My understanding is that people often view STM as a low-level technique for implementing higher-level concurrency mechanisms. In F#, you can instead use the "usual" imperative style, and there are (for example) many already-implemented thread-safe collections and primitives: http://msdn.microsoft.com/en-us/library/dd997305.aspx – Tomas Petricek Mar 31 '11 at 10:46
  • I suspect that in a language with pervasive mutation, STM is somewhat low level -- you're either locking everything or only able to synchronize a small number of things directly. In a language where most things are immutable, there's less of an issue -- TVars and TChans provide a higher-level interface than otherwise. – sclv Mar 31 '11 at 11:30
  • I agree that the expressiveness of a functional language helps with parallelism. If the design phase does not reveal all the concurrency, some of it can be seen directly in the code later on, which is really amazing. I have the impression that Microsoft is pushing multicore computing very hard, officially with F# and the .NET framework, and unofficially with research on Haskell parallelism; that bodes well for parallelism support in the future. – pad Mar 31 '11 at 11:52
  • @Simon Marlow: Cilk is an obvious counter example. – J D Apr 09 '11 at 01:37
  • @luqui: Can you give a concrete example where STM is a "major load off the brain"? – J D Apr 09 '11 at 01:38
  • @pad: Code rarely reveals concurrency and I doubt purely functional code does any more than the next paradigm. For example, immutable singly-linked lists are ubiquitous in Haskell but the worst possible data structure for parallel programming. The only viable solution is to completely redesign your algorithms and data structures. – J D Apr 09 '11 at 01:46
  • @Jon, http://stackoverflow.com/questions/5518546/software-transactional-memory-composability-example – luqui Apr 09 '11 at 15:12
  • @luqui: I'm not sure STM is a "major load off the brain" there compared to sorting the locks before acquiring them or any lock-free alternatives. – J D Apr 09 '11 at 17:47
  • @Jon, depends on the code you are writing. There are a lot of approaches appropriate for a lot of situations. STM is one that is available in Haskell and not F#. I think it is a pretty natural way to write concurrent code, but it certainly won't work best for every problem -- neither will any of the alternatives. – luqui Apr 09 '11 at 17:57
  • +1 for "The real value of functional languages is in the expressivity of the language - you can easily define abstractions for your problem" – Yin Zhu Apr 10 '11 at 13:17
  • @luqui: I'm curious because I've never come across a problem that had me reaching for an STM implementation. In particular, I don't understand how STM could be useful in the context of parallelism because it looks extremely inefficient to me. Wouldn't passing a message to an agent that performed all transfers in series be better in every respect? – J D Apr 10 '11 at 20:16
  • @Jon, I'm not arguing that STM is the best solution for the problem in the question I linked. A nice example of where STM hits its domain is if you want a large balanced tree that can be updated in parallel. Anyway, your argument is reminiscent of novice programmers when they learn about closures: "OK, but I have never wanted them, and I can't possibly see how they would be useful." It's a *new abstraction*, and you have to get familiar and comfortable with it before uses start jumping out at you. – luqui Apr 10 '11 at 20:44
  • @luqui: I'm just looking for any compelling examples. And STM is not a "new abstraction"; it was proposed 11 years ago! Concurrent updates to balanced trees are exactly the kind of thing I'm talking about: it is a fundamentally pointless thing to do, because a serial implementation will be faster and simpler, and efficient parallelism requires a non-tree data structure. http://www.cl.cam.ac.uk/research/srg/netos/lock-free/ – J D Apr 10 '11 at 23:26
  • @Jon, "it is a fundamentally pointless thing to do because a serial implementation will be faster and simpler and efficient parallelism requires a non-tree data structure". Can you give evidence for that claim? The page you linked to seems to be a bunch of papers detailing different STM implementations. – luqui Apr 11 '11 at 01:48
  • @Jon, and I meant "new" in the sense of an unfamiliar concept to a given programmer, not new research in the software industry. (Thus my use of closures as an example; the concept of closures has been around since before the advent of computers!) – luqui Apr 11 '11 at 01:50
  • @luqui: Some of the work I cited gives performance results for concurrent balanced trees, and in all cases they do not scale, because even non-local updates end up contending for nodes near the root. So it is much more complicated but not really any better than serialization. In contrast, concurrent hash tables and skip lists provide the same functionality and much better performance. – J D Apr 13 '11 at 10:45

I'm going to get downvoted for this, but let me be the curmudgeon.

Functional languages are great. They change the way you think about decomposing problems, and they map incredibly well to certain kinds of problems. Every programmer should be familiar with at least one functional programming language. But "functional languages are inherently good for parallel programming" is probably not the reason why.

It's worth noting that what is unquestionably the most successful parallel functional language of all time, Erlang, uses completely bog-standard message passing to implement its parallelism, and the connection between its functional nature and its parallelism is indirect at best.

Twenty-five years ago, there was a huge push for functional languages, because the argument then seemed very compelling as well -- functional languages seemed a natural fit for the increasingly parallel architectures of the time. The argument was that compilers and runtimes would automatically be able to implement parallelism, due to the side-effect-free nature of the languages. SISAL, which even then could be compiled into shared- and distributed-memory (!) executables, was developed at this time, as was Haskell, as was ML, the predecessor to Objective CAML and the other languages in the ML family.

This is all just offered as a bit of historical perspective. For literally a quarter of a century, functional-language advocates, including some of the brightest minds in the field, have been saying that functional languages' day in the sun was just around the corner, and that applicability to parallelism was going to be the killer app. And yet, here we are, with no one even having heard of SISAL; and I'm guessing that most readers of this post think of Haskell as a hot new language.

It could well be, of course, that now with multicore considerations things have finally gotten so urgent that functional languages are really going to shine, or that this year will bring some breakthrough I can't even imagine which completely changes the landscape. This year could well be different from each of the 25 previous years. But then again, it might not be.

The vast, vast, vast majority of parallel and concurrent code extant now, and well into the foreseeable future, is not written in functional languages. If you're looking to learn about parallelism, by all means explore the mechanisms available in F#, Haskell, etc.; but don't limit yourself to those, is all I'm saying.

Jonathan Dursi
  • Simon Peyton Jones dealt with this point head-on in his recent Functional Programming eXchange talk, [Managing parallelism: embrace diversity, but control side effects](http://skillsmatter.com/podcast/scala/talk-by-haskell-expert-simon-peyton-jones/js-1434). – Simon Marlow Mar 31 '11 at 06:56
  • Thank you for the nice story. I don't intend to limit myself to F# or Haskell. Because parallel programming is quite new to me, I'm a little confused by the good parallelism support from both the F# and Haskell communities. I raised this question not for an objective answer but more for discussion about F# and Haskell parallelism. After your comment and @simon-marlow's, I'm starting to think that no parallelism paradigm should be absolute; a careful mixture of functional and imperative parallelism could be a good idea. – pad Mar 31 '11 at 12:03
  • The killer argument for functional languages isn't parallelism; it's the factor of between 4 and 10 reduction in the size of the code. The thing stopping functional languages in the past was that they couldn't compile to efficient code. That is no longer true. – Paul Johnson Mar 31 '11 at 17:39
  • Just to clarify, Erlang is a language for concurrent programming, not parallelism (i.e. parallel speedup isn't the focus). – Don Stewart Apr 03 '11 at 21:50
  • Well, ultimately it's the programmers who decide what a language is for, and googling "erlang" "scalability" gives 400k+ hits; so a lot of people _are_ using it for its ability to make use of lots of processors. (E.g., weak scaling, as vs strong scaling -- but parallel performance nonetheless.) And performance has _always_ been important to Erlang; the real-time telco stuff _does_ require efficient use of whatever hardware is available. So while I agree with the distinction you're making, I'm not sure it's true that Erlang is "for" concurrent vs parallel programming. – Jonathan Dursi Apr 04 '11 at 03:02

The simple answer is that because both languages have solid support for parallelism and concurrency, this shouldn't be a factor in your decision on which language to use. I.e., there are much larger factors to take into consideration for a decision like that.

ildjarn

There's no objective answer. For Haskell, there's a large body of active work, and no "one size fits all" approach. Instead, in Haskell, many different tools for achieving parallelism are provided. What's the status of multicore programming in Haskell?

Don Stewart
  • Cilk-style multicore programming seems to be the one size that fits all. Consequently, both Intel and Microsoft adopted its design for their TBB and TPL products. – J D Apr 10 '11 at 23:23

Haskell's purity means that it makes a clear distinction between parallel processing and concurrency.

  • If you are looking to speed up your big data-crunching application by distributing the work over multiple cores then you want parallel processing, which means "par" and its derivatives. By careful use of these constructs you can have your CPU-intensive pure function run N times faster on N cores whilst being sure that you haven't changed the meaning of the original code or introduced non-determinism into your program.

  • On the other hand if you want your program to interact with multiple entities in the outside world, interleaving communications from different entities but still having some degree of shared resources, then you want concurrency, which means using "fork" and some combination of STM and TVars. STM gives you nice transactional semantics, which goes a long way towards eliminating race conditions and other nasties of concurrency. You do need to pay attention to collision frequency and retry rates though.
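To make the distinction above concrete, here is a minimal sketch (the function names are mine, for illustration; only the standard `parallel` and `stm` packages are assumed). The first half is deterministic parallelism with `par`; the second is concurrency with STM:

```haskell
import Control.Parallel (par, pseq)
import Control.Concurrent.STM (TVar, STM, atomically,
                               newTVarIO, readTVar, writeTVar, readTVarIO)

-- Parallelism: evaluate a and b at the same time. par is only an
-- evaluation hint, so the result is deterministic: always 625000750000.
parSum :: Int
parSum = a `par` (b `pseq` (a + b))
  where
    a = sum [1 .. 1000000 :: Int]
    b = sum [1 ..  500000 :: Int]

-- Concurrency: a shared counter updated transactionally. atomically
-- guarantees the read-modify-write cannot interleave with other threads.
bump :: TVar Int -> STM ()
bump tv = readTVar tv >>= writeTVar tv . (+ 1)

main :: IO ()
main = do
  print parSum
  tv <- newTVarIO 0
  atomically (bump tv)
  atomically (bump tv)
  readTVarIO tv >>= print   -- 2
```

As the answer says, `par` cannot change the meaning of the original code, while the STM half only becomes interesting once multiple threads (forked with `forkIO`) contend for `tv`.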

Paul Johnson
  • This answer seems to be born out of a misconception about what parallelism and concurrency are. – J D Apr 08 '11 at 22:57
  • @Jon, this answer accurately reflects the accepted definitions of those two terms in the Haskell community. Are the definitions different in other communities? – luqui Apr 09 '11 at 17:45
  • Yes, there is much confusion about these terms. Don Syme recently pointed out that fetching URLs at the same time (e.g. using `Async.Parallel`) is parallelism and not concurrency, although it is not CPU bound. Concurrent garbage collection is an example of a CPU-bound application that is concurrent but not (necessarily) parallel. I particularly liked DMBarbour's statement correcting Simon Marlow on his blog that enumerated some deterministic forms of concurrency, such as – J D Apr 10 '11 at 20:27
  • temporal logic programming, functional reactive programming, future passing with single-assignment variables, Kahn process networks, and concurrent constraint programming. And I liked his definitions of parallelism as "an implementation property that says many calculations are happening at the same time" and concurrency as "a semantic property that says many behaviors are happening at the same time". – J D Apr 10 '11 at 20:30