19

As the title says, how do two or more threads share memory on the heap that they have allocated? I've been thinking about it and I can't figure out how they can do it. Here is my understanding of the process; presumably I am wrong somewhere.

Any thread can add or remove a given number of bytes on the heap by making a system call which returns a pointer to this data, presumably by writing to a register which the thread can then copy to the stack. So two threads A and B can allocate as much memory as they want. But I don't see how thread A could know where the memory that thread B has allocated is located. Nor do I know how either thread could know where the other thread's stack is located. Multi-threaded programs share the heap and, I believe, can access one another's stack but I can't figure out how.

I tried searching for this question but only found language specific versions that abstract away the details.

Edit: I am trying not to be language or OS specific but I am using Linux and am looking at it from a low level perspective, assembly I guess.

trincot
Lemma Prism
  • possible duplicate of [Do threads share the heap?](http://stackoverflow.com/questions/1665419/do-threads-share-the-heap) – Tomas Voracek Aug 10 '12 at 21:09
  • 2
    No, I don't think so. I saw that one while searching and it doesn't ask how threads share the heap, only if they do. I want to know precisely how threads share data. What is the mechanism of communication? I think they share pointers to the allocated memory but I don't know how they do it. – Lemma Prism Aug 10 '12 at 21:46
  • How can I edit my question for best clarity? What is most confusing about it? Usr has answered my question but I want to make sure my question is understandable to others and right now it doesn't seem very clean to me. – Lemma Prism Aug 10 '12 at 23:51

5 Answers

13

My interpretation of your question: How can thread A get to know a pointer to the memory B is using? How can they exchange data?

Answer: They usually start with a common pointer to a common memory area. That allows them to exchange other data including pointers to other data with each other.

Example:

  1. Main thread allocates some shared memory and stores its location in p
  2. Main thread starts two worker threads, passing the pointer p to them
  3. The workers can now use p and work on the data pointed to by p

And in a real language (C#) it looks like this:

// Start ThreadProc on a new thread and pass someData to it.
new Thread(ThreadProc).Start(someData);

Threads usually do not access each other's stacks. Everything starts from one pointer passed to the thread procedure.
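The three steps above can be sketched in C++ with `std::thread`. This is only an illustration under assumptions: `SharedData`, `worker`, and `sum_with_two_workers` are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical shared record; the names are illustrative only.
struct SharedData {
    std::vector<int> input;
    long sum_even_slots = 0;  // written only by worker 0
    long sum_odd_slots = 0;   // written only by worker 1
};

// Each worker receives the same pointer p and sums every second element.
void worker(SharedData* p, int id) {
    long sum = 0;
    for (std::size_t i = static_cast<std::size_t>(id); i < p->input.size(); i += 2) {
        sum += p->input[i];
    }
    // The two workers write to different fields, so no lock is needed here.
    if (id == 0) {
        p->sum_even_slots = sum;
    } else {
        p->sum_odd_slots = sum;
    }
}

long sum_with_two_workers() {
    // Step 1: the main thread allocates shared memory on the heap.
    auto p = std::make_unique<SharedData>();
    p->input = {1, 2, 3, 4, 5, 6};

    // Step 2: it starts two workers and hands each one the pointer.
    std::thread a(worker, p.get(), 0);
    std::thread b(worker, p.get(), 1);

    // Step 3: the workers operate on *p; join() waits for them and
    // makes their writes visible to the main thread.
    a.join();
    b.join();
    return p->sum_even_slots + p->sum_odd_slots;
}
```

Calling `sum_with_two_workers()` from `main` returns the sum of all six elements. Neither worker had to discover where the data lives: both were handed the address when they were started.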


Creating a thread is an OS function. It works like this:

  1. The application calls the OS using the standard ABI/API
  2. The OS allocates stack memory and internal data structures
  3. The OS "forges" the first stack frame: it sets the instruction pointer to ThreadProc and "pushes" someData onto the stack (or places it in the argument register, depending on the calling convention). I say "forge" because this first stack frame does not arise naturally but is created artificially by the OS.
  4. The OS schedules the thread. ThreadProc does not know it has been set up on a fresh stack. All it knows is that someData is at the usual position where it would expect its argument.

And that is how someData arrives in ThreadProc. This is the way the first, initial data item is shared. Steps 1-3 are executed synchronously by the parent thread. Step 4 happens on the child thread.
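On Linux, the API a program normally uses for this is `pthread_create`, whose last parameter is the single `void*` that arrives in the start routine. Below is a minimal sketch, assuming a POSIX system; `ThreadArgs`, `thread_proc`, and `add_on_new_thread` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <pthread.h>
#include <stdexcept>

// Hypothetical argument block: one pointer can carry any amount of data.
struct ThreadArgs {
    int a;
    int b;
    int result;  // filled in by the new thread
};

// Entry point with the signature pthread_create expects.
extern "C" void* thread_proc(void* raw) {
    auto* args = static_cast<ThreadArgs*>(raw);
    args->result = args->a + args->b;
    return nullptr;
}

int add_on_new_thread(int a, int b) {
    ThreadArgs args{a, b, 0};  // lives on the parent's stack
    pthread_t tid;
    // The fourth argument is the one pointer handed to thread_proc.
    if (pthread_create(&tid, nullptr, thread_proc, &args) != 0) {
        throw std::runtime_error("pthread_create failed");
    }
    // args must outlive the thread, so wait before returning.
    pthread_join(tid, nullptr);
    return args.result;
}
```

Here the argument block happens to sit on the parent's stack. That works because all threads of a process share one address space, but only as long as the parent keeps that stack frame alive, which is why the function joins before returning.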

usr
  • How does the main thread pass pointer p to the worker threads? Are the worker threads duplicates of the main thread? I read that is how the creation of new processes works and would certainly explain how it would pass data, if that is how it works, though don't know how they would differentiate. I am looking at this closer to an assembly perspective, not a higher level language. – Lemma Prism Aug 10 '12 at 22:41
  • I added a response to your comment. – usr Aug 10 '12 at 22:47
  • Ok, so the mechanism by which multiple threads can communicate is initiated by pushing predefined data (usually pointers I imagine) onto the thread's stack during thread creation via the OS function's arguments. Thanks! This answers my question but I will wait a day or two before actually accepting an answer, in case someone writes a better or clearer response. This is sufficient for my purposes, but out of curiosity can you give the OS function even more data to push onto the stack? Also, is this an OS specific mechanism or does Windows or Linux do it differently? – Lemma Prism Aug 10 '12 at 23:44
  • This is theoretically OS specific but in practice this is the only way. One pointer is always enough because you can have that pointer point to an arbitrary struct containing everything. A thread-pool thread might receive a ptr to a (synchronized) work queue. The user-mode developer can put arbitrary wrappers on top of that simple API. C# allows you to even pass in a closure object implicitly. That is a pretty mighty wrapper hiding everything that I just described. Example: `{ int x = 0; new Thread(() => { x++; }); }` Just works and all built on top of that simple API. – usr Aug 10 '12 at 23:48
  • One pointer is enough, yes, but _can_ you add more data? I am just curious if operating systems allow it. In practice it really doesn't matter either way since, like you said, one is enough. – Lemma Prism Aug 10 '12 at 23:57
  • No you can't. See http://msdn.microsoft.com/en-us/library/bb202727.aspx (lpvThreadParam); http://linux.die.net/man/3/pthread_create (arg) – usr Aug 11 '12 at 00:05
2

A really short answer from a bird's-eye view (1000 miles above):
Threads are execution paths of the same process, and the heap actually belongs to the process (and as a result is shared by the threads). Each thread just needs its own stack to function as a separate unit of work.
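A small C++ sketch of that point, using hypothetical names: memory allocated on the heap by one thread stays valid and reachable from another thread, because the heap belongs to the process rather than to the thread that called `new`.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Returns true if memory allocated by one thread is usable from another.
bool heap_is_shared_between_threads() {
    std::string* from_worker = nullptr;

    std::thread worker([&from_worker] {
        // Locals declared here would live on the worker's own stack and
        // disappear when it exits; the string below lives on the process heap.
        from_worker = new std::string("allocated by worker");
    });
    worker.join();  // after join(), the worker's writes are visible here

    // The worker is gone, but its heap allocation is still there.
    bool ok = (*from_worker == "allocated by worker");
    delete from_worker;  // any thread of the process may free it
    return ok;
}
```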

Cratylus
0

Threads can share memory on a heap if they both use the same heap. By default most languages/frameworks have a single default heap that code can allocate memory from. In unmanaged languages you generally make explicit calls to allocate heap memory; in C, for example, that might be malloc. In managed languages heap allocation is usually automatic, and how it is done depends on the language, usually through the new operator, though that depends slightly on context. If you provide the OS or language context you're asking about, I might be able to provide more detail.

Peter Ritchie
0

A thread shares with the other threads belonging to the same process its code section, data section, and other operating system resources such as open files and signals.

Saurabh Juneja
0

The part you are missing is static memory containing static variables.

This memory is allocated when the program is started and is assigned known addresses (determined at link time). All threads can access this memory without exchanging any data at run time, because the addresses are effectively hardcoded.

A simple example might look like this:

#include <atomic>

// Global variable, visible to every thread at a fixed address.
std::atomic<int> common_var;

void thread1() {
  common_var = compute_some_value();
}

void thread2() {
  do_something();
  int current_value = common_var;
  do_more();
}

And of course the global value may be a pointer that can be used to exchange heap memory. The producer allocates some objects; the consumer takes and uses them.

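#include <atomic>

// Global variables.
std::atomic<bool> produced{false};  // handshake flag; its atomic accesses order the writes to data_pointer
SomeData* data_pointer = nullptr;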

void producer_thread() {
  while (true) {
    if (!produced) {
      SomeData* new_data = new SomeData();
      data_pointer = new_data;
      // Let the other thread know there is something to read.
      produced = true;
    }
  }
}

void consumer_thread() {
  while (true) {
    if (produced) {
      SomeData* my_data = data_pointer;
      data_pointer = nullptr;
      // Let the other thread know we took the data.
      produced = false;
      do_something_with(my_data);
      delete my_data;
    }
  }
}

Please note: these are not examples of good concurrent code, but they show the general idea without too much clutter.
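For comparison, here is one way the same handoff could be written without busy-waiting, using a mutex and a condition variable. It is a sketch under assumptions rather than a recommended design: `Mailbox`, `produce`, `consume`, and `run_handoff` are hypothetical names, and `SomeData` is given a single `value` field purely for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical payload type standing in for SomeData.
struct SomeData {
    int value;
};

// A one-slot mailbox: the shared pointer plus the lock that guards it.
struct Mailbox {
    std::mutex m;
    std::condition_variable cv;
    std::unique_ptr<SomeData> slot;  // empty means "nothing produced"
};

void produce(Mailbox& box, int count) {
    for (int i = 1; i <= count; ++i) {
        auto item = std::make_unique<SomeData>(SomeData{i});  // heap allocation
        std::unique_lock<std::mutex> lock(box.m);
        // Sleep until the consumer has emptied the slot.
        box.cv.wait(lock, [&box] { return box.slot == nullptr; });
        box.slot = std::move(item);
        box.cv.notify_all();
    }
}

int consume(Mailbox& box, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        std::unique_lock<std::mutex> lock(box.m);
        // Sleep until the producer has filled the slot.
        box.cv.wait(lock, [&box] { return box.slot != nullptr; });
        std::unique_ptr<SomeData> mine = std::move(box.slot);  // take ownership; slot is now empty
        box.cv.notify_all();
        lock.unlock();
        total += mine->value;  // use the data outside the lock
    }
    return total;
}

int run_handoff(int count) {
    Mailbox box;
    int total = 0;
    std::thread producer(produce, std::ref(box), count);
    std::thread consumer([&box, &total, count] { total = consume(box, count); });
    producer.join();
    consumer.join();
    return total;
}
```

The idea is unchanged: both threads reach the heap objects through one location they both know about. In this version that location is passed to them by reference instead of being a global, and the waiting threads sleep instead of spinning.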

Frax