We are planning to write a highly concurrent application in one of the very-high-level programming languages.

1) Do Python, Ruby, or Haskell support true multithreading?

2) If a program contains threads, will a Virtual Machine automatically assign work to multiple cores (or to physical CPUs if there is more than 1 CPU on the mainboard)?

True multithreading = multiple independent threads of execution utilize the resources provided by multiple cores (not by only 1 core).

False multithreading = threads emulate multithreaded environments without relying on any native OS capabilities.

psihodelia
  • "true" multithreading? What does that mean? Please define "true" multithreading. – S.Lott Dec 17 '09 at 10:47
  • true multithreading = multiple independent threads of execution utilize the resources provided by multiple cores (not by only 1 core) – psihodelia Dec 17 '09 at 10:56
  • @S.Lott: done, I've updated original post – psihodelia Dec 17 '09 at 13:22
  • Now that you have updated your question, I am even *more* confused by your distinction between "true" and "false" multithreading. The BEAM Erlang VM, for example, schedules Erlang threads across multiple CPUs and multiple cores. So, according to your definition, BEAM supports true multithreading. But BEAM *does not* rely on native OS capabilities in any way; in fact, it can actually run without any OS at all, on the bare hardware. Thus, according to your definition, BEAM *does not* support true multithreading; rather, it has false multithreading. Which is it? That definition is useless. – Jörg W Mittag Dec 17 '09 at 13:45
  • The *whole reason* why BEAM is so incredibly scalable is that it does not rely on heavyweight, bloated, slow OS threads but implements its own. For example, Linux threads are 4 or 8 KiB on 32-bit machines and 8 or 16 KiB on 64-bit machines. Windows NT threads are 12 KiB, I believe. BEAM's are around 300 bytes. 32-bit Linux can comfortably handle tens of thousands of threads; at around 800,000 it would simply run out of memory. I've seen BEAM running on Linux handle 1 million threads on a not very beefy netbook during a presentation. – Jörg W Mittag Dec 17 '09 at 13:51

8 Answers

1) Do Python, Ruby, or Haskell support true multithreading?

This has nothing to do with the language. It is a question of the hardware (if the machine only has 1 CPU, it is simply physically impossible to execute two instructions at the same time), the Operating System (again, if the OS doesn't support true multithreading, there is nothing you can do) and the language implementation / execution engine.

Unless the language specification explicitly forbids or enforces true multithreading, this has absolutely nothing whatsoever to do with the language.

All the languages that you mention, plus all the languages that have been mentioned in the answers so far, have multiple implementations, some of which support true multithreading, some don't, and some are built on top of other execution engines which might or might not support true multithreading.

Take Ruby, for example. Here are just some of its implementations and their threading models:

  • MRI: green threads, no true multithreading
  • YARV: OS threads, no true multithreading
  • Rubinius: OS threads, true multithreading
  • MacRuby: OS threads, true multithreading
  • JRuby, XRuby: JVM threads, depends on the JVM (if the JVM supports true multithreading, then JRuby/XRuby does, too, if the JVM doesn't, then there's nothing they can do about it)
  • IronRuby, Ruby.NET: just like JRuby, XRuby, but on the CLI instead of on the JVM

See also my answer to another similar question about Ruby. (Note that that answer is more than a year old, and some of it is no longer accurate. Rubinius, for example, uses truly concurrent native threads now, instead of truly concurrent green threads. Also, since then, several new Ruby implementations have emerged, such as BlueRuby, tinyrb, Ruby Go Lightly, Red Sun and SmallRuby.)

Similar for Python:

  • CPython: native threads, no true multithreading
  • PyPy: native threads, depends on the execution engine (PyPy can run natively, or on top of a JVM, or on top of a CLI, or on top of another Python execution engine. Whenever the underlying platform supports true multithreading, PyPy does, too.)
  • Unladen Swallow: native threads, currently no true multithreading, but fix is planned
  • Jython: JVM threads, see JRuby
  • IronPython: CLI threads, see IronRuby
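
To make the "native threads" entry for CPython concrete, here is a minimal sketch, assuming Python 3.8 or later (needed for `threading.get_native_id`). The helper name `collect_native_thread_ids` is made up for illustration. It only shows that each `threading.Thread` is backed by its own OS thread. It does not show parallel execution: whether those OS threads ever run Python code at the same instant depends on the implementation (in a standard CPython build, the global interpreter lock serializes them).

```python
import threading


def collect_native_thread_ids(count=4):
    """Start `count` threads and return the OS-level thread id of each."""
    barrier = threading.Barrier(count)  # holds every thread until all have started
    lock = threading.Lock()             # protects the shared result list
    native_ids = []

    def worker():
        barrier.wait()  # all `count` threads are alive here, so ids cannot be reused
        with lock:
            native_ids.append(threading.get_native_id())

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return native_ids


if __name__ == "__main__":
    ids = collect_native_thread_ids()
    # Distinct ids mean distinct OS threads, not necessarily parallel execution.
    print(len(set(ids)), "distinct OS threads")
```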

For Haskell, at least the Glorious Glasgow Haskell Compiler supports true multithreading with native threads. I don't know about UHC, LHC, JHC, YHC, HUGS or all the others.

For Erlang, both BEAM and HiPE support true multithreading with green threads.

2) If a program contains threads, will a Virtual Machine automatically assign work to multiple cores (or to physical CPUs if there is more than 1 CPU on the mainboard)?

Again: this depends on the Virtual Machine, the Operating System and the hardware. Also, some of the implementations mentioned above don't even have Virtual Machines.
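
As a rough illustration of the hardware and OS side of this, the sketch below asks the operating system how many cores the current process may be scheduled on. `usable_core_count` is a hypothetical helper name; `os.sched_getaffinity` exists only on some platforms (Linux, for example), so the code falls back to `os.cpu_count()`. Having several usable cores says nothing about whether a given language implementation will actually spread its threads across them.

```python
import os


def usable_core_count():
    """Return the number of cores the OS may schedule this process on."""
    if hasattr(os, "sched_getaffinity"):
        # Respects CPU affinity masks (e.g. set with taskset) where supported.
        return len(os.sched_getaffinity(0))
    # Portable fallback; cpu_count() may return None if it cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("cores available to this process:", usable_core_count())
```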

Jörg W Mittag
  • The reason that CPython and YARV don't have true multithreading, even when run on a multicore processor and even though they use native threads, is that they have a global interpreter lock. It prevents a thread from seeing the execution environment (e.g. the set of available class definitions and method calls) in an inconsistent state when some other thread tries to change it. – Ken Bloom Dec 18 '09 at 02:36
  • YARV calls it the Giant VM Lock (GVL), but in principle you're right. At least on YARV, there are plans to remove the GIL, though. If you look in `thread.c` you can see a description of the three different threading models YARV went through or is going to go through: green threads -> native threads with GIL -> truly parallel native threads with fine-grained locking. However, there's no code yet. In CPython, it's similar: Unladen Swallow plans to remove the GIL, and, if it's not too intrusive, to contribute this removal back to CPython. – Jörg W Mittag Dec 18 '09 at 12:49

The Haskell implementation, GHC, supports multiple mechanisms for parallel execution on shared memory multicore. These mechanisms are described in "Runtime Support for Multicore Haskell".

Concretely, the Haskell runtime divides work among N OS threads, distributed over the available compute cores. These N OS threads in turn run M lightweight Haskell threads (sometimes millions of them). In turn, each Haskell thread can take work from a spark queue (there may be billions of sparks).

The runtime schedules work to be executed on separate cores, migrates work, and load balances. The garbage collector is also a parallel one, using each core to collect part of the heap.

Unlike Python or Ruby, there's no global interpreter lock, so for that and other reasons GHC is particularly good on multicore in comparison; see, e.g., Haskell v Python on the multicore shootout.

igouy
Don Stewart

The GHC compiler will run your program on multiple OS threads (and thus multiple cores) if you compile with the `-threaded` option and then pass `+RTS -N<x> -RTS` at runtime, where `<x>` is the number of OS threads you want.

dave4420
Ganesh Sittampalam

The current version of Ruby, 1.9 (YARV, the C-based implementation), has native threads but is constrained by a GIL. As far as I know, Python has the same GIL problem.

However, both Jython and JRuby (mature Java implementations of Python and Ruby, respectively) provide native multithreading, with no green threads and no GIL.

Don't know about Haskell.

khelll
  • Are Jython and JRuby very different from their origins? – psihodelia Dec 17 '09 at 11:05
  • What does GIL mean? Google suggests "Gas Insulated Lines" which is not helpful.... – tobsen Dec 17 '09 at 11:45
  • http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock explains what the Global Interpreter Lock (GIL) is. – tobsen Dec 17 '09 at 11:56
  • @psihodelia no, they don't differ; they are just written in Java instead of C. If you have some C extensions, then you won't be able to use them inside Jython and JRuby, but both Jython and JRuby let you use Java jars inside your code, which is a big alternative and an advantage as far as I can tell. – khelll Dec 17 '09 at 12:02
  • @tobsen, GIL stands for Global Interpreter Lock. – khelll Dec 17 '09 at 12:03

Haskell supports threads; in addition, you get a pure functional language with no side effects.

Svetlozar Angelov
  • So what? Being pure makes threading easy because you know functions can run in other threads and won't modify the environment. Haskell also has monads, which are impure sections of the code. More importantly, this doesn't answer the OP's question at all. Please remove this answer. – Evan Carroll Dec 20 '09 at 20:22

For real concurrency, you probably want to try Erlang.

Daniel Roseman
  • It seems that Erlang also doesn't support true multithreading. From Wikipedia: Processes are the primary means to structure an Erlang application. Erlang processes are neither operating system processes nor operating system threads, but lightweight processes somewhat similar to Java's original “green threads”. – psihodelia Dec 17 '09 at 11:02
  • What does that Wikipedia quote have to do with true multithreading? – Jörg W Mittag Dec 17 '09 at 13:04
  • Erlang's great at distribution, and has good concurrency abstractions, but concurrency is not the same as parallelism, and Erlang's multicore runtime is both fairly new and somewhat slower than, say, Haskell's. Still, it is a lot better than Python or Ruby in comparison, and you will learn a lot about concurrency, just not multicore parallelism. * http://ghcmutterings.wordpress.com/2009/10/06/parallelism-concurrency/ * http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=all – Don Stewart Dec 18 '09 at 07:42

I second the choice of Erlang. Erlang supports distributed, highly concurrent programming out of the box. It does not matter whether you call it "multi-threading" or "multi-processing". Two important elements to consider are the level of concurrency and the fact that Erlang processes do not share state.

No shared state among processes is a good thing.

Richie

Haskell is suitable for anything. Python has the processing module, which (I think, but I'm not sure) helps to avoid GIL problems, so it is suitable for anything, too.
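
The `processing` package mentioned here was added to the Python standard library as `multiprocessing` in Python 2.6. A minimal sketch of the idea, assuming a recent Python 3: the work is handed to separate interpreter processes, each with its own GIL, so CPU-bound tasks can run on several cores. The function name `factorials_in_processes` and the choice of `math.factorial` as the workload are illustrative only.

```python
import math
import multiprocessing


def factorials_in_processes(numbers, workers=2):
    """Compute math.factorial for each number using a pool of worker processes."""
    with multiprocessing.Pool(processes=workers) as pool:
        # map() splits the inputs across the workers and preserves input order.
        return pool.map(math.factorial, numbers)


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module.
    print(factorials_in_processes([5, 10, 20]))
```

Processes do not share memory the way threads do, so arguments and results are pickled and copied between them; this tends to pay off only when each task does a fair amount of work.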

But in my opinion, the best you can do is to pick the highest-level language available with a static type system for big and huge projects. Today these languages are OCaml, Haskell, and Erlang.

If you want to develop something small, Python is good. But as things grow, all of Python's benefits are eaten up by myriads of tests.

I haven't used Ruby. I still think that Ruby is a toy language. (Or at least there's no reason to learn Ruby when you already know Python; better to read the SICP book.)

Vasiliy Stavenko