5

Do global pointers have a scope that exists between threads?

For instance, suppose I have two files, file1.c and file2.c:

file1.c:

uint64_t *g_ptr = NULL;

modify_ptr(&g_ptr) { 
    //code to modify g_ptr to point to a valid address 
}

read_from_addr() {
    //code which uses g_ptr to read values from the memory it's pointing to
}

file2.c:

function2A() {
    read_from_addr();
}

So I have threadA, which runs through file1.c and executes modify_ptr(&g_ptr) and also read_from_addr(). Then threadB runs through file2.c, executing function2A().

My question is: Does threadB see that g_ptr is modified? Or does it still see that it's pointing to NULL?

If that's not the case, what does it mean for a pointer to be global? And how do I ensure that this pointer is accessible between different threads?

Please let me know if I need to clarify anything. Thanks

OfLettersAndNumbers
  • you would need to declare the pointer as `volatile` to see immediate updates in different threads on the pointer – thumbmunkeys Sep 03 '13 at 18:24
  • Two words: "synchronization" and "`volatile`". – cHao Sep 03 '13 at 18:24
  • @Joe: It does, however, prevent optimizations that would cache the value and reuse it without checking it again. Which is an important part of making the new value visible. – cHao Sep 03 '13 at 18:26
  • Agreed: http://stackoverflow.com/questions/4557979/when-to-use-volatile-with-multi-threading/4558031#4558031 – Joe Sep 03 '13 at 18:27
  • `volatile` does not provide memory consistency guarantees. – user7116 Sep 03 '13 at 18:41
  • Volatile is useless here, there must be some synchronisation, and the synchronisation itself will ensure that the global data is not cached in one thread while it is modified by the other thread. see http://stackoverflow.com/questions/2484980/why-is-volatile-not-considered-useful-in-multithreaded-c-or-c-programming – Étienne Sep 03 '13 at 18:43
  • By the way "modify_ptr(&g_ptr) { " is not proper C, did you mean "modify_ptr(uint64_t *pointer) { " ? – Étienne Sep 03 '13 at 18:49
  • @Étienne: Synchronization won't keep the compiler from optimizing away the rechecking of a value. That's precisely what `volatile` is for. – cHao Sep 03 '13 at 18:59
  • @cHao: See http://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/, and http://stackoverflow.com/questions/2478397/atomic-swap-in-gnu-c/2478520#2478520 – Étienne Sep 03 '13 at 19:00
  • @Étienne: See that first link yourself -- particularly the second useful case. That's precisely what we have here; the other thread is an "external agent". That `volatile` doesn't guarantee anything about the visibility of the new value is irrelevant. The point is that it forces the compiler to assume the value can change, so something like `while (g_ptr);` doesn't loop forever when you enable optimization. – cHao Sep 03 '13 at 19:01
  • @cHao: It can not loop forever if you have acquired a mutex on it, the other thread has to wait for your thread containing `while(g_ptr)` to release the mutex to be able to modify it. The memory barrier provided by a mutex prevents the compiler from reordering, making volatile completely useless. – Étienne Sep 03 '13 at 19:14
  • @Étienne: Reordering is irrelevant with a value the code has cached. And a mutex doesn't prevent code from caching the value. A memory fence only keeps the *hardware* from caching; it doesn't prevent the compiler from generating a `mov esi, [g_ptr]` and using `esi` wherever it'd otherwise use `[g_ptr]`. – cHao Sep 03 '13 at 19:18
  • @Étienne: Even with a mutex there is no guarantee that `g_ptr` has correct values since that compiler may have the value cached in some registers. – bkausbk Sep 03 '13 at 19:22
  • @cHao, @bkausbk: Ok, indeed if you don't acquire a mutex to read the global variable and do `while(g_ptr)` without using any function in the loop the value of `g_ptr` could be cached if you don't declare it volatile. But if you lock a mutex to read the `g_ptr` value its value can not be cached by the compiler. – Étienne Sep 03 '13 at 19:43
  • @Étienne: Consider that until C11, the definition of the abstract machine did not even contemplate multithreading. Under every prior standard, the compiler was free to assume that any object that it knows the current thread didn't modify, *didn't change*, unless that object was accessed directly through a `volatile` variable. Mutexes and memory fences be damned; they weren't even contemplated til C11. And as for C11, i can't assume that has changed til i've seen where the spec says so. – cHao Sep 03 '13 at 20:03
  • @cHao Acquiring a mutex to read the variable forbids the compiler to assume it didn't change since the mutex provides a memory barrier. I don't get your point since even if you declare the variable volatile you still need some kind of mutex because read and writes are not atomic. Once you use the mutex volatile is useless. – Étienne Sep 03 '13 at 20:21
  • @Étienne: The act of acquiring a mutex does no such thing prior to C11. The very concept of a "memory barrier" did not exist in standard C til then. – cHao Sep 03 '13 at 20:24
  • @cHao: Mutexes implementing memory barriers existed long before C11, for example posix mutexes: http://stackoverflow.com/questions/3208060/does-guarding-a-variable-with-a-pthread-mutex-guarantee-its-also-not-cached – Étienne Sep 03 '13 at 20:30
  • @Étienne: POSIX is not C. Whatever it defines is over and above what C defines, and is not binding on a compiler that does not claim POSIX conformance. In particular, it's not binding on any compiler for a non-POSIX OS, and does not force reloads of a value unless the compiler promises as much. – cHao Sep 03 '13 at 20:44
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/36746/discussion-between-etienne-and-chao) – Étienne Sep 03 '13 at 20:58

4 Answers

7

My question is: Does threadB see that g_ptr is modified? Or does it still see that it's pointing to NULL?

Maybe. If the pointer is accessed without any sort of external synchronization, you're likely to see bizarre, highly non-reproducible results -- the compiler may make optimizations based on its analysis of your code, including optimizations that assume a variable is not modified along certain code paths. For example, consider this code:

// Global variable
int global = 0;

// Thread 1 runs this code:
while (global == 0)
{
    // Do nothing
}

// Thread 2 at some point does this:
global = 1;

In this case, the compiler can see that global is not modified inside the while loop and that the loop doesn't call any external functions, so it can "optimize" the loop into something like this:

if (global == 0)
{
    while (1)
    {
        // Do nothing
    }
}

Adding the volatile keyword to the declaration of the variable prevents the compiler from making this optimization, but this was not the intended use case of volatile when the C language was standardized. Adding volatile here will only slow down your program in small ways and mask the real problem -- lack of proper synchronization.

The proper way to manage global variables that need to be accessed simultaneously from multiple threads is to use mutexes to protect them [1]. For example, here's a simple implementation of modify_ptr using a POSIX threads mutex, along with a read_from_addr that takes the same mutex before reading:

uint64_t *g_ptr = NULL;
pthread_mutex_t g_ptr_mutex = PTHREAD_MUTEX_INITIALIZER;

void modify_ptr(uint64_t **ptr, pthread_mutex_t *mutex)
{
    // Lock the mutex, assign the pointer to a new value, then unlock the mutex
    pthread_mutex_lock(mutex);
    *ptr = ...;
    pthread_mutex_unlock(mutex);
}

void read_from_addr()
{
    // Take the same mutex before reading through g_ptr
    pthread_mutex_lock(&g_ptr_mutex);
    if (g_ptr != NULL)
    {
        // ... read values through g_ptr ...
    }
    pthread_mutex_unlock(&g_ptr_mutex);
}

Mutex functions ensure that the proper memory barriers are inserted, so any changes made to a variable protected by a mutex will be properly propagated to other CPU cores, provided that every access of the variable (including reads!) is protected by the mutex.

[1] You can also use specialized lock-free data structures, but those are an advanced technique and very easy to get wrong.
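
For completeness, here is a rough sketch (not part of the original answer; the threadA_main and threadB_main wrappers and the extern declarations are assumptions made for illustration) of how the question's two threads might be started with pthread_create, with both of them going through the mutex-protected functions above:

#include <pthread.h>
#include <stdint.h>

// Declarations from file1.c / file2.c, assumed visible via a header or extern
extern uint64_t *g_ptr;
extern pthread_mutex_t g_ptr_mutex;
void modify_ptr(uint64_t **ptr, pthread_mutex_t *mutex);
void read_from_addr(void);
void function2A(void);

static void *threadA_main(void *arg)
{
    (void)arg;
    modify_ptr(&g_ptr, &g_ptr_mutex);
    read_from_addr();
    return NULL;
}

static void *threadB_main(void *arg)
{
    (void)arg;
    function2A();   // calls read_from_addr() internally
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, threadA_main, NULL);
    pthread_create(&b, NULL, threadB_main, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}

Note that threadB may still observe g_ptr == NULL if it happens to run before threadA's call to modify_ptr; the mutex guarantees that whichever value it reads is read consistently, not that the write has already happened.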

Adam Rosenfield
  • +1 Nice answer, and thank you for hopefully quieting the use-volatile-for-thread-synchro ideologues (at least for now, anyway). – WhozCraig Sep 03 '13 at 20:24
  • @WhozCraig: Sorry, i was away for a bit. :P I didn't say use `volatile` for synchronization. Ever. Not once. What i did say was to use it to keep the compiler from getting too clever for its own good. Even this answer admits that it does as much, and my only beef with it (and everyone parroting the volatile-is-the-devil party line) is that *not one single person in the whole cult* has yet to provide conclusive evidence that a mutex actually does (or even promises to do) `volatile`'s job consistently and reliably in all cases. – cHao Sep 04 '13 at 03:33
  • I completely agree. They have two different purposes, plain and not-so-simple. There are things `volatile` *is designed* to accommodate. Just as there are things synchronization objects are designed to accommodate. They do different things, and you'll be just as likely to see me say "synchro-objects are *not* designed to do that; volatile is" when the situation calls for it as I am about not using volatile for what synchronization objects are designed for. volatile is certainly not the devil, but it certainly *can* be if not used for what it was intended. =P. Likewise the other way around. – WhozCraig Sep 04 '13 at 03:49
  • @WhozCraig: Also this code could be compiled so that `ptr` is stored in register. I see no guarantee that this will never happen without `volatile`. Btw. no one said that `volatile` is used for synchronization, it is however a prerequisite to use synchronization concepts. – bkausbk Sep 04 '13 at 07:55
  • @bkausbk and with that we've officially reached the 100,000 ft stratosphere. I've written too many mountains of thread-safe code without once having to pin `volatile` into the source to even discuss what will ultimately go nowhere. (which is not to say I've never used `volatile`; I have; but not because I was afraid of a global variable being harbored in a register). I'll chalk it up to luck. Each and every time (I'm off to buy a Lotto ticket now). I wish you the very best. – WhozCraig Sep 04 '13 at 08:29
  • @bkausbk: `volatile` is *not* a prerequisite to use synchronization concepts. The compiler is not allowed to cache a global variable in a register across a non-inlineable function call like `pthread_mutex_lock()` because it has no way of knowing if that function might modify the global variable, so it must reload the variable from memory. – Adam Rosenfield Sep 04 '13 at 17:46
  • @AdamRosenfield: Ok if this is right `volatile` would not be required in this case. Do you have any official information where I can read this? – bkausbk Sep 05 '13 at 06:01
  • @bkausbk: See the C99 language standard: §5.1.2.3 for program execution, §6.7.3/6 for `volatile`, and annex C for sequence points. The key text is §5.1.2.3/2, which says "[...] modifying an object [...] are all *side effects* [...]. At [...] *sequence points*, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place." – Adam Rosenfield Sep 05 '13 at 18:30
  • @AdamRosenfield: Ok, but it doesn't look like that it has something to do with my question. You said "The compiler is not allowed to cache a global variable in a register across a non-inlineable function call" or am I missing something. – bkausbk Sep 05 '13 at 21:33
  • @bkausbk: A global variable is an object. If you read that variable, call a function which modifies that variable, and then read that variable again, the *abstract machine* specified in the C language standard says that when you read that variable for the second time, it will have the value set in the function call. A conforming implementation is not required to implement the abstract machine exactly, but it is required to match it in observable side effects at sequence points. Hence, after the function call, the value read must be the most recent value written. – Adam Rosenfield Sep 06 '13 at 05:06
  • @bkausbk: (continued) So the compiler is not allowed to cache a global variable in a register across an external function call (if the function is in the same translation unit, then the compiler may be able to deduce that the function can't possibly modify the global variable), because doing so would no longer match the semantics of the abstract machine. – Adam Rosenfield Sep 06 '13 at 05:10
  • @AdamRosenfield: Regarding your statement about global variables: So are you saying that global variables are only global within that file? And what happens if you have a static global variable. Is that global within the process or just within the file that it exists? I don't understand what you meant when you said "A conforming implementation is not required to implement the abstract machine exactly, but it is required to match it in observable side effects at sequence points." Thanks! – OfLettersAndNumbers Oct 09 '13 at 18:37
4

This question is the textbook example of what makes concurrent programming difficult. A really thorough explanation could fill an entire book, as well as lots of articles of varying quality.

But we can summarize a little. A global variable is in a memory space visible to all the threads. (The alternative is thread-local storage, which only one thread can see.) So you would expect that if you have a global variable G, and thread A writes value x to it, then thread B will see x when it reads that variable later on. And in general, that is true -- eventually. The interesting parts are what happens before "eventually".

The biggest sources of trickiness are memory consistency and memory coherence.

Coherence describes what happens when thread A writes to G and thread B tries to read it at nearly the same moment. Imagine that threads A and B are on different processors (let's also call them A and B for simplicity). When A writes to a variable, there is a lot of circuitry between it and the memory that thread B sees. First, A will probably write to its own data cache. It will store that value for a while before writing it back to main memory. Flushing the cache to main memory also takes time: there are a number of signals that have to go back and forth across wires, capacitors, and transistors, and a complicated conversation between the cache and the main memory unit. Meanwhile, B has its own cache. When changes occur in main memory, B may not see them right away — at least, not until it reloads that cache line. And so on. All in all, it may be many microseconds before thread A's change is visible to B.

Consistency describes what happens when A writes to variable G and then variable H. If it reads back those variables, it will see the writes happening in that order. But thread B may see them in a different order, depending on whether H gets flushed from cache back to main RAM first. And what happens if both A and B write to G at the same time (by the wall clock), and then try to read back from it? Which value will they see?

Coherence and consistency are enforced on many processors with memory barrier operations. For example, the PowerPC has a sync opcode, which says "guarantee that any writes that have been made by any thread to main memory will be visible to any read after this sync operation." (Basically, it does this by rechecking every cache line against main RAM.) The Intel architecture does this automatically to some extent, if you warn it ahead of time that "this operation touches synchronized memory".
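
To make that concrete, here is a minimal sketch of the classic publish-a-flag pattern. It is not from this answer: the names data and ready are invented, and it assumes a compiler that provides GCC's __atomic builtins, which pair an ordinary store or load with exactly the kind of barrier described above:

#include <stdio.h>

// Shared globals: a payload and a "published" flag.
int data = 0;
int ready = 0;

// Thread A: write the payload first, then publish the flag with
// release semantics, so the store to data cannot become visible
// after the store to ready.
void producer(void)
{
    data = 42;
    __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

// Thread B: wait for the flag with acquire semantics; once it sees
// ready == 1, it is also guaranteed to see data == 42.
void consumer(void)
{
    while (__atomic_load_n(&ready, __ATOMIC_ACQUIRE) == 0)
        ;  // spin (a real program would block or back off)
    printf("data = %d\n", data);
}

Without the paired release store and acquire load (or full barriers such as the sync described above), B could observe ready == 1 while still reading a stale value of data.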

Then you have the issue of compiler reordering. This is where the code

int foo( int *e, int *f, int *g, int *h) 
{
   *e = *g;
   *f = *h;
   // <-- another thread could theoretically write to g and h here
   return *g + *h ;
}

can be internally converted by the compiler into something more like

int bar( int *e, int *f, int *g, int *h) 
{
  int b = *h;
  int a = *g;
  *f = b ;
  int result = a + b;
  *e = a ;
  return result;
}

which could give you a completely different result if another thread performed a write at the location marked above! Also, notice how the writes occur in a different order in bar. This is the problem that volatile is supposed to solve: it prevents the compiler from storing the value of *g in a local, and instead forces it to reload that value from memory every time it sees *g.

As you can see, this is inadequate for enforcing memory coherence and consistency across many processors. It was really invented for cases where you had one processor that was trying to read from memory-mapped hardware -- like a serial port, where you want to look at a location in memory every n microseconds to see what value is currently on the wire. (That is really how I/O worked back when they invented C.)
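
As an aside, that original use case looks roughly like this (a sketch only; the register address and status bit are made up for illustration):

#include <stdint.h>

// Hypothetical memory-mapped serial-port status register.
// volatile tells the compiler that every read must really go out
// to this address, because the hardware can change it at any time.
#define UART_STATUS ((volatile uint32_t *)0x10000000u)

int data_ready(void)
{
    return (*UART_STATUS & 0x1u) != 0;   // poll the "data ready" bit
}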

What to do about this? Well, like I said, there are whole books on the subject. But the short answer is that you probably want to use the facilities your operating system / runtime platform provide for synchronized memory.

For example, Windows provides the Interlocked memory access API to give you a clear way of communicating memory between threads A and B. GCC exposes some similar built-in functions. Intel's Threading Building Blocks gives you a nice interface for x86/x64 platforms, and the C++11 thread support library provides some facilities as well.
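
For instance, applying the last of those options to the question's pointer might look roughly like this, assuming a C11 compiler with <stdatomic.h> (the simplified signatures and the g_storage object are illustrative, not taken from the question):

#include <stdatomic.h>
#include <stdint.h>

// The question's global pointer, made atomic so that loads and stores
// of the pointer itself are indivisible and carry ordering guarantees.
static _Atomic(uint64_t *) g_ptr = NULL;

static uint64_t g_storage;   // hypothetical object for g_ptr to point at

void modify_ptr(void)
{
    g_storage = 123;
    // Release store: everything written before this line (g_storage)
    // becomes visible to any thread that acquire-loads the pointer.
    atomic_store_explicit(&g_ptr, &g_storage, memory_order_release);
}

uint64_t read_from_addr(void)
{
    uint64_t *p = atomic_load_explicit(&g_ptr, memory_order_acquire);
    return p ? *p : 0;   // 0 if the pointer has not been published yet
}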

Crashworks
0

My question is: Does threadB see that g_ptr is modified?

Probably. g_ptr is accessed by threadB via read_from_addr(), so the same g_ptr is seen all the time. This has nothing to do with the “intramodular globalness” of g_ptr: it would work just as well if g_ptr were declared static and had internal linkage since, as you have written it here, it appears at file scope before read_from_addr().

Or does it still see that it's pointing to NULL?

Probably not. Once the assignment is made, it's visible to all threads.

The issue here is that if you have two threads accessing shared data where at least one thread is writing to it (which is the case here), you need to synchronize access to it because ordinary memory reads and writes are not atomic. In POSIX, for example, the behaviour under these circumstances is formally “undefined”, which basically means all bets are off and your machine can go rogue-o-matic and eat your cat as far as the standard is concerned.

So you will really want to use an appropriate thread synchronization primitive (e.g. a read/write lock or a mutex) to ensure a well-behaved program. On Linux with pthreads, you'll want to look at pthread_rwlock_* and pthread_mutex_*. I know that other platforms have equivalents, but I have no clue what they are.
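
For example, protecting the question's pointer with a read/write lock might look roughly like this (a sketch only; the simplified signatures and the new_addr parameter are mine, not the OP's):

#include <pthread.h>
#include <stdint.h>

uint64_t *g_ptr = NULL;
pthread_rwlock_t g_ptr_lock = PTHREAD_RWLOCK_INITIALIZER;

void modify_ptr(uint64_t *new_addr)
{
    // Exclusive (write) lock while changing the pointer
    pthread_rwlock_wrlock(&g_ptr_lock);
    g_ptr = new_addr;
    pthread_rwlock_unlock(&g_ptr_lock);
}

uint64_t read_from_addr(void)
{
    uint64_t value = 0;
    // Shared (read) lock: many readers may hold this at once
    pthread_rwlock_rdlock(&g_ptr_lock);
    if (g_ptr != NULL)
        value = *g_ptr;
    pthread_rwlock_unlock(&g_ptr_lock);
    return value;
}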

Emmet
-1

Global variables are available to all the threads.

For example:

struct yalagur
{
    char name[200];
    int rollno;
    struct yalagur *next;
} head;

int main()
{
    // thread1/thread2/thread3 stand in for starting three threads;
    // in real code they would be created with pthread_create
    thread1();
    thread2();
    thread3();
}

Now the above structure is shared between all the threads.

Any thread can access the structure directly.

So this is called shared memory between threads.

You need to use a mutex, shared variables, or similar concepts to update, read, or delete the shared memory.
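
A minimal sketch of what that might look like with POSIX threads (the head_mutex, the update_head function, and the pthread_create calls are illustrative additions, not part of this answer):

#include <pthread.h>
#include <string.h>

struct yalagur
{
    char name[200];
    int rollno;
    struct yalagur *next;
} head;

pthread_mutex_t head_mutex = PTHREAD_MUTEX_INITIALIZER;

// Every thread updates the shared global head under the same mutex.
void *update_head(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&head_mutex);
    strcpy(head.name, "example");
    head.rollno = 1;
    pthread_mutex_unlock(&head_mutex);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2, t3;
    pthread_create(&t1, NULL, update_head, NULL);
    pthread_create(&t2, NULL, update_head, NULL);
    pthread_create(&t3, NULL, update_head, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_join(t3, NULL);
    return 0;
}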

Thanks Sada

Sada