
Here are slides by Rob Pike on this. Every time I go through them I feel like a moron. I'm not able to figure out the gist of it. It's well understood that concurrency is decomposition of a complex problem into smaller components. If you cannot correctly divide something into smaller parts, it's hard to solve it using concurrency.

But there isn't much detail in the slides on how to get parallelism once you've achieved concurrency. In the Lesson slide (num 52), he says Concurrency - "Maybe Even Parallel". But the question is - When and How can concurrency correctly and efficiently lead to Parallelism?

My guess is that, under the hood, Rob's pointing out that developers should work at the level of concurrency - and parallelism should be the language's/VM's concern (gomaxprocs?). Just care about intelligent decomposition into smaller units, concerned only about correct concurrency - parallelism will be taken care of by the "system".
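To make my guess concrete, here's a toy Go sketch (my own names and numbers, not from the slides): the code only expresses the decomposition into independent units, and runtime.GOMAXPROCS decides how many of them may actually execute at the same time.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	// Only the decomposition is expressed here: four independent units of work.
	// Whether they actually run in parallel is the runtime's concern;
	// GOMAXPROCS caps how many OS threads execute goroutines simultaneously.
	runtime.GOMAXPROCS(1) // concurrency without parallelism
	// runtime.GOMAXPROCS(runtime.NumCPU()) // same program, possibly parallel

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			sum := 0
			for j := 0; j < 10000000; j++ {
				sum += j
			}
			fmt.Println("unit", id, "done:", sum)
		}(i)
	}
	wg.Wait()
}
```

The concurrent structure is identical in both cases; only the degree of parallelism changes.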

Please shed some light.

haps10
  • There's a bit too much scope for open ended discussion here. I've asked the Programmers.SE mods if this will work over there. Thanks. – Kev Jul 28 '12 at 15:16

1 Answer


What Rob Pike means

When you have the abstract form of an algorithm in mind, you then have to choose whether you will implement it with Message Passing or Shared Memory or maybe a Hybrid. You will also have to consider the type of memory access (NUMA, UMA, etc.) and the topology used (Hypercube, Torus, Ring, Mesh, Tree, etc.).

This seems like a lot of work for someone who just wants something, maybe even something simple, done in a parallel way (e.g. a parallel for).

And it is a lot of work, especially if you change the topology (so you can have all of its advantages).

So you write the parallel code (be it simple or complex), and the VM or compiler will choose what seems to be the best way to go, possibly even running it in a sequential way! (An example would be the Task Parallel Library for .NET.)
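A rough Go analogue of that "just give me a parallel for" situation might look like the sketch below (a toy helper of my own, not a library API): you only describe the chunks of work, and how many of them actually run at the same time is the scheduler's decision - exactly the kind of choice the VM/compiler makes for you.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parallelFor splits [0, n) into chunks and hands each chunk to a goroutine.
// How many chunks execute at the same time is the scheduler's problem, not
// the caller's; with one available core it all runs effectively sequentially.
func parallelFor(n int, body func(i int)) {
	workers := runtime.GOMAXPROCS(0) // query the current setting, don't change it
	chunk := (n + workers - 1) / workers

	var wg sync.WaitGroup
	for start := 0; start < n; start += chunk {
		end := start + chunk
		if end > n {
			end = n
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				body(i)
			}
		}(start, end)
	}
	wg.Wait()
}

func main() {
	squares := make([]int, 16)
	parallelFor(len(squares), func(i int) { squares[i] = i * i })
	fmt.Println(squares)
}
```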

Important EDIT:

I should mention that I am talking about concurrency in a program / algorithm and not between independent programs that run in a system.

You said that

It's well understood that concurrency is decomposition of a complex problem into smaller components. If you cannot correctly divide something into smaller parts, it's hard to solve it using concurrency

but this is wrong, because those smaller components may depend on each other sequentially to complete, so even if you divide something into small components, it does not mean you achieve concurrency / parallelism.
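A tiny Go sketch of what I mean (hypothetical stages, just for illustration): the work is split into two components, but the second depends on the first, so the split buys you nothing for a single input.

```go
package main

import "fmt"

func main() {
	// The work is divided into two "components", stage 1 and stage 2, but
	// stage 2 needs stage 1's result, so for one input the division gives
	// no concurrency: stage 2 simply blocks until stage 1 has finished.
	out1 := make(chan int)
	out2 := make(chan int)

	go func() { // stage 1
		out1 <- 21 * 2
	}()
	go func() { // stage 2: depends on stage 1's output
		v := <-out1
		out2 <- v + 1
	}()

	// Total time is stage1 + stage2, not max(stage1, stage2).
	fmt.Println(<-out2)
}
```

Parallelism only appears when there are many independent inputs flowing through such stages, not from the division itself.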

In all my classes on parallel and distributed algorithms (both in my BS and MS) we never talked about "the concurrency we obtained and now let's see how to obtain parallelism". If you use the word concurrency to describe an algorithm then you imply parallelism and vice versa.

In the literature you will also find a thin line between distributed and parallel.

From an algorithmic point of view you can say concurrent, parallel or distributed and you get the same idea.

From an implementation point of view, if you say "parallelism" you usually mean a program that runs on the local computer or a cluster (Shared Memory communication), and "distributed" when you run the program on a grid (Message Passing communication).

Now, both distributed and parallel imply concurrency.

I think you should be more skeptical about the precise meaning of these terms, because even in the literature (and I'm talking about people who actually contributed to this field, not just the creation of some language) they are used to express the abstract concept.

Concurrency in an algorithm (or program) means having pieces of code that can run independently of other pieces of code, even if they will eventually have to wait for some other pieces of code (check Amdahl's Law to see exactly what this implies).
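For example, a quick back-of-the-envelope application of Amdahl's Law (the numbers below are made up, just to show the shape of the bound):

```go
package main

import "fmt"

// Amdahl's Law: if a fraction p of the work can run in parallel and the
// rest is serial, the speedup on n processors is 1 / ((1-p) + p/n).
func amdahl(p float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n))
}

func main() {
	// Even with 95% of the program parallelizable, the 5% serial part
	// caps the speedup at 20x no matter how many processors you add.
	for _, n := range []int{2, 8, 64, 1024} {
		fmt.Printf("p=0.95, n=%4d -> speedup %.1fx\n", n, amdahl(0.95, n))
	}
}
```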

So whenever you have concurrency in an algorithm / program you also have parallelism.

I think it is better to just implement some parallel AND distributed algorithms to better understand the idea behind them. If you know C/C++ you can use OpenMPI for distributed (Message Passing) implementations and OpenMP for parallel (Shared Memory) implementations.

EDIT:

He could also mean concurrency as the abstract principle and parallelism as the way it is implemented [Shared Memory, Message Passing, a Hybrid of both; type of memory access (NUMA, UMA, etc.)].
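To make that Shared Memory / Message Passing split concrete, here is the same trivial counter written both ways in Go (a toy sketch of the two styles, not a claim about how any particular runtime implements them):

```go
package main

import (
	"fmt"
	"sync"
)

// Shared Memory style: goroutines mutate one counter behind a mutex.
func sharedMemory() int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

// Message Passing style: workers send increments; a single owner sums them.
func messagePassing() int {
	msgs := make(chan int)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			msgs <- 1
		}()
	}
	go func() { wg.Wait(); close(msgs) }()

	counter := 0
	for m := range msgs {
		counter += m
	}
	return counter
}

func main() {
	fmt.Println(sharedMemory(), messagePassing()) // 4 4
}
```

The concurrent structure is the same; what changes is whether the pieces coordinate through shared state or through messages, which is the implementation-level choice mentioned above.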

amb
  • I think the point Rob Pike is making is about "processing parallelism" - multiple threads running at the same time on multicore processors. – Marcelo De Zen Jul 28 '12 at 13:08
  • I have covered this in the EDIT part :). – amb Jul 28 '12 at 13:20
  • Amdahl's law talks about parallelism when you have more than 1 core/processor involved. http://en.wikipedia.org/wiki/Amdahl's_law – Marcelo De Zen Jul 28 '12 at 13:37
  • @devundef well you can't talk about parallel / concurrent / distributed algorithms if you only have **one** single processor (**as in 1 single ALU**) at hand. So what are you saying? – amb Jul 28 '12 at 13:43
  • Seems like you have done a lot of reading and no practice. You should dedicate about 15 minutes to reading again what concurrency is. You're missing some basics. You get concurrency on a single-core processor using preemptive time-shared threads. What you cannot achieve on a single-core processor is parallelism. Got it? – Marcelo De Zen Jul 29 '12 at 13:32
  • If you can, please respond to the question from @haps10: When and How can concurrency correctly and efficiently lead to Parallelism? – Marcelo De Zen Jul 29 '12 at 14:13
  • @devundef I am using the notions **only with regard to algorithms (and their implementations)**. And I have enough practice implementing parallel and distributed algorithms using OpenMP and OpenMPI. Don't confuse the notions regarding an algorithm with those regarding the OS view. This is why I feel concurrency in the context of algorithms (and their implementations) is such a bad choice of word, because people will start to think on the OS level and not specifically about that program. – amb Jul 29 '12 at 14:44
  • @devundef I have already responded in the "What Rob Pike means" part. The simple idea is that when you implement parallel and distributed algorithms you start to see beyond threads, semaphores, mutexes and so on, because you are confronted with a lot of (new) choices. – amb Jul 29 '12 at 14:53