1

Say I have the following classes:

public class Item { 
public: 
  CString name;  
  int id;
  UINT type;
  bool valid;  
  void invalidate(){
    valid = false;
  }
  ...
}

public class itemPool { 
public: 
  static std::vector<Item*> items ; 
  void invalidateOfType(UINT type){
    for( auto iter : items )
       if ( iter->type == type )
         iter->invalidate();
  }
 ...
}

Can I call the "invalidateOfType(UINT type)" - method from different threads? Is there any possibility of "undefined behaviour" ? In other words, can I use static resources from in multiple threads ( make parallel calls to that resource ) ?

SergeyA
  • 61,605
  • 5
  • 78
  • 137
  • 1
    `if ( iter->type = type )` is going to give you problems. – RyanP Nov 06 '15 at 18:15
  • What kind of threads are we talking about? Windows threads? POSIX pthreads? C++-11 threads? – David Schwartz Nov 06 '15 at 18:21
  • 1
    The short answer is that a static field is the same as a global variable in terms of thread-safety (the initialization is thread safe though; see related questions). As you can see, there's a lot of discussion, so could you clarify by adding all method signatures that access the `items` vector? Are there any other places `items` might be initialized, changed, or accessed? – Kenney Nov 06 '15 at 18:43
  • @Kenney Items are initialized from the main application thread. At that time, no other threads which might access "items" are running. I am tying "type" to a single thread. So usually one thread will only work on one "type" of item. – Sascha Minor Nov 06 '15 at 18:59
  • It then appears that the answers below are all wrong, and your code is thread safe ;-) – Kenney Nov 06 '15 at 19:36
  • @Kenney, do not generalize. I was saying the code is thread safe for 2 hours now. – SergeyA Nov 06 '15 at 19:39
  • @SergeyA [your answer from one hour ago](http://stackoverflow.com/posts/33573121/edit/73c2426d-e85a-4282-8a10-26160c89ee56) wasn't saying that. I read all the answers and comments before posting that comment. You just updated your answer after reading these comments. Synchronisation issue perhaps ;-) – Kenney Nov 06 '15 at 19:46
  • @Kenney,we, of all persons, should understand the relativity of time :D :D :D – SergeyA Nov 06 '15 at 19:47
  • I kept that information back because I was interested in that particular call in general. – Sascha Minor Nov 06 '15 at 20:04

4 Answers4

2

Static resources are no different than any shared resources. Unless their methods are thread-safe, they should not be called from multiple threads simultaneously. In your particular case, it boils down to the question of invalidate() being thread safe. Iterating over vector itself is thread-safe.

Quite unexpectedly (to me!) the question turned out into something very interesting and educational. Following are points of interest to remember here. In explaining those, I will take the code at the face value. I will also operate under the assumption (actually clarified by OP in some of the comments) that no code is READING while the invalidation takes place.

The code as written would iterate over the same vector at the same time. Since iterating the vector which is not modified during iteration is thread safe, this part is thread safe and needs no further discussion.

The second question is 'can two or more threads execute invalidateOfType for the same type at the same type'? If the answer is NO - every thread has it's own type - than again, the code is 100% thread safe, since same objects are not accessed from more than one thread and no further discussion is neccessary.

If the answer to the above question is 'YES', than we have a conondrum. Effectively it boils down to the question 'when two or more threads set the same memory location to the same value at the same time, is it going to produce unexpected results'? Precise reading of standards does not give a straight answer.

SergeyA
  • 61,605
  • 5
  • 78
  • 137
  • iterating is safe only as long as there are no reallocations, since `items` is completely exposed, I wouldn't call it safe in this case – BeyelerStudios Nov 06 '15 at 18:20
  • This answer is wrong. Iterating over `vector` is not thread safe in regards to other threads writing into it. – Blindy Nov 06 '15 at 18:20
  • A slight nuance:"Unless their methods are thread-safe, they cannot be called from multiple threads simultaneously *with predictable effects*". Of course those methods can be called from multiple threads simultaneously. – Kenney Nov 06 '15 at 18:21
  • @Kenney, and it is me who is often penalized for being pendatic :) I will change wording slightly. – SergeyA Nov 06 '15 at 18:22
  • @Blindy, I suggest you actually read the question. It asks very specific question regarding multithreading access, and not mentions a thing about insertion. – SergeyA Nov 06 '15 at 18:22
  • @SergeyA I understand the manipulation of `items` as a kind of access, the access to the vector that is. – BeyelerStudios Nov 06 '15 at 18:23
  • @BeyelerStudios your undestanding is incorrect. As a matter of fact, the code as posted is thread safe. – SergeyA Nov 06 '15 at 18:25
  • The code as posted iterates over a 0-length array. If that's what you call successful programming then sure, completely thread safe. – Blindy Nov 06 '15 at 18:28
  • @SergeyA no it is not: what if an `Item` of type A is being invalidated in one thread, in another one of type B and they coincidentally share memory accross a cache line? nothing is preventing the write from one thread to overwrite the write of the other then, that includes the value of `valid`. – BeyelerStudios Nov 06 '15 at 18:30
  • @BeyelerStudios it wouldn't make a difference in the posted code: the `valid` field is never read or used in a conditional. It is only set to false. Overwriting the false with false makes no real difference. – Kenney Nov 06 '15 at 18:31
  • @BeyelerStudios true - but they write the same value. I do not expect this to provide any sort of undefined behaviour. – SergeyA Nov 06 '15 at 18:31
  • @Kenney sorry but there's always a read: the read of the cache line – BeyelerStudios Nov 06 '15 at 18:31
  • @SergeyA they have different types, one writes `items[i]` another writes `items[i+1]` – BeyelerStudios Nov 06 '15 at 18:32
  • @BeyelerStudios, care to elaborate on read? Who is reading here? Better in terms of asembly opcodes. – SergeyA Nov 06 '15 at 18:32
  • @BeyelerStudios not for write-only operations like `mov [mem_addr], 0`; for `xchg [mem_addr], eax`, yes. – Kenney Nov 06 '15 at 18:33
  • @BeyelerStudios, I am not sure I follow. Would you mind posting an answer with your reasoning? I would be especially interested in seeing the cache lines reads explicitly shown. – SergeyA Nov 06 '15 at 18:33
  • @Kenney pinned it. This is why I asked for assembly. – SergeyA Nov 06 '15 at 18:33
  • @Kenney since `vector` is a container with implementation dependent behaviour and no compiler intrinsics are used here, there's absolutely no way to reason about which instructions will be generated by your compiler. @SergeyA http://www.akkadia.org/drepper/cpumemory.pdf a cache line is usually loaded into memory, the value written to and then written as a whole back into memory (i.e. it is flagged dirty first), two unloaded lines can be read and written to concurrently, this is what's called a race condition, it's not on a single variable, it's on a whole cache line. – BeyelerStudios Nov 06 '15 at 18:48
  • 1
    @BeyelerStudios I see what you mean. But then your answer is wrong, because not only the `invalidate` method on the object must be thread safe, but any method modifying any data that might be part of the same cache line. For instance, the preceding/next item, or even the vector instance itself. – Kenney Nov 06 '15 at 18:53
  • @BeyelerStudios - disagree with you and agree with Kenny. If what you are saying would be true, no one could actually write any multithreaded program reasonably - since even accessing two unrelated variables in two different threads would be prone to data race condition you are mentioning, if they happen to be in the same cache line. Since obviously it is not the case, you are wrong :) – SergeyA Nov 06 '15 at 18:56
  • @BeyelerStudios You are totally and completely incorrect. Hardware makes the caches coherent. Cache lines have *no* effect on threads whatsoever (other than affecting performance). (This is a very common misconception, but it is just that, a misconception. The threading issues come from compiler optimizations, non-atomic operations, posted writes, and prefetches -- not data caches.) – David Schwartz Nov 06 '15 at 18:57
  • 1
    @BeyelerStudios plain wrong. Utterly, plainly wrong. I would not be surprised if such hardware is what they make very bad developers to work on in hell as a punishment for their sins, but our real life hardware certainly does not work this way. – SergeyA Nov 06 '15 at 18:59
  • @BeyelerStudios, I can only repeat myself. Consider simple code `int a; int b; thr1() { a = 10; do(a); } thr2() { b = 20; do(b); }`. Anyone will tell you the code is thread safe. You are telling it is not. Reconsider your position. – SergeyA Nov 06 '15 at 19:12
  • @BeyelerStudios Complete nonsense. Threads don't write cache lines, they write to objects. If you try to fix your example to avoid operations that don't actually exist, you'll find that there's no problem. Both modifications will occur correctly. I can't fully educate you in the form of comments to this answer, but if you ask your own question, I'll be happy to. (It sounds like your question is "How is thread safety possible on modern CPUs?") – David Schwartz Nov 06 '15 at 19:12
  • thanks for the discussion, so reading from multiple threads is should be safe. What about writing? Other than wiriting two different values at the same time wouldn't make much sense to the applicaiton itself. – Sascha Minor Nov 06 '15 at 19:30
  • @SaschaMinor, the whole disucssion here is about it. So far we are uncertain. I will update my answer to include some of the things we ARE certain. – SergeyA Nov 06 '15 at 19:31
  • @BeyelerStudios The only time that two threads can actually execute simultaneously is in multi-core CPU's, and luckily section 3.3.4 in the awesome link you shared (thanks!) explains that this is handled at hardware level. – Kenney Nov 06 '15 at 19:31
  • David: from section 3.2: *A cache line which has been written to and which has not been written back to main memory is said to be “dirty”. Once it is written the dirty flag is cleared.* so threads do write lines, even when modifying them only partially. @Kenney this doesn't take into account the effects of the instruction pipeline of your cpu, i.e. the invalidation queue: *Consequently, a CPU can be oblivious to the fact that a cache line in its cache is actually invalid* source: [wiki](https://en.wikipedia.org/wiki/MESI_protocol#Memory_Barriers). I'd be happy to change my position. – BeyelerStudios Nov 06 '15 at 19:55
  • @BeyelerStudios That looks like a serious design flaw at the hardware level! As for a single-threaded single core CPU, if we're talking about a second thread being scheduled to run on the same CPU, then the instruction pipeline will be flushed with a [serialisation instruction](http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf#G13.8467), typically an `iret`. And, if even a single threaded application isn't thread safe, we might as well quit programming... :-) – Kenney Nov 06 '15 at 20:24
  • @Kenney I agree, on single cpu this is safe – BeyelerStudios Nov 06 '15 at 20:27
  • @BeyelerStudios You are hopelessly confused, and I don't know how to unconfuse you. You cite a direct quote about what the cache does and somehow think it's what a thread does. I don't know how to fix your confusion. If two threads write to different objects that share a cache line, one thread must write first because a cache line cannot be exclusive in two caches. Whichever thread writes second will be writing to the line the first thread modified, so no change will be lost. You seem to not understand the entire point of the MESI protocol. – David Schwartz Nov 08 '15 at 00:04
  • @DavidSchwartz I was confused. I arrived at the same conclusion now. No thanks to your arguments though, "plain wrong" and "hopelessly confused" didn't help me read through chapter 3. But thanks for pointing out an overly pessimistic outlook I had and making me study up. – BeyelerStudios Nov 09 '15 at 11:39
  • 1
    @BeyelerStudios Those weren't for your benefit, they were for the benefit of people who might have listened to you. It's perfectly fine to ask question and to seek to gain knowledge. But I have very little patience for people who spew misinformation in answers to other people's questions, leading them into the weeds. (That's why I encouraged you to ask your own question.) – David Schwartz Nov 09 '15 at 11:45
  • you demonstrated a lot of patience in flaming though, consider this – BeyelerStudios Nov 09 '15 at 11:47
  • @BeyelerStudios I prioritize stopping the spread of harmful misinformation over being nice to people who spread harmful misinformation. If you ask your own question, I'll demonstrate extraordinary patience in sharing my knowledge with you, but if you authoritatively state misinformation as answers to other people's questions even after it's clear that something is wrong with your argument, the gloves come off and I tell it like it is. – David Schwartz Nov 09 '15 at 17:05
1

No, you cannot. This could result in two threads executing valid = false; at the same time on the same valid. It is not permissible to modify an object in one thread while another thread is, or might be, accessing it. (To be sure, check the docs for the particular threading model or library you are using, but most have this rule.)

I would consider this okay on Windows, because everyone does it. It's unlikely that some subsequent change to the platform will break everyone's code. I wouldn't do this on POSIX platforms because the documentation is pretty clear that it's not allowed and it's not commonly done.

David Schwartz
  • 179,497
  • 17
  • 214
  • 278
  • this is an interesting question. is it undefined to set the variable to the same value from multiple threads, when there is no reading from it at all? I doubt it is undefined. – SergeyA Nov 06 '15 at 18:26
  • @SergeyA It depends on the threading library/platform. POSIX pthreads, for example, is quite clear that it is undefined if an object can be modified in one thread while another thread is, or might be, accessing it. Windows threads are too poorly specified to know. – David Schwartz Nov 06 '15 at 18:55
  • 'Accessing' is vague. Does it define 'access'? – SergeyA Nov 06 '15 at 19:00
  • @SergeyA I'm not sure if it's precisely defined, but the understanding is that it means any code that could potentially read or write to the object. It is unfortunately vague and most of my understanding comes from talking to the guys who wrote the standard and made the early implementations. (It is not hard to create fairly simple code such that we can't tell from just the specification whether it's supposed to be allowed or not, unfortunately. Windows threads are even worse, with almost nothing about the memory rules documented.) – David Schwartz Nov 06 '15 at 19:06
  • David, fair enougth. I believe, when standards are vague, we should look into real life. Since we (I grant you that, you might as well grant me that) know a thing or two about multithreading on hardware level, do we expect real hardware to give us a problem here? Ignoring cache lines nonsense comments above, I dare say, it will not. The simple assignment will be translated to simple mov, which will be executed and will update the cache, and, in it's due time, memory. Even torn writes (which I am yet to see) would not cause a problem here! – SergeyA Nov 06 '15 at 19:10
  • 1
    @SergeyA Right, and with some threading models (I'm looking at you Windows!) the best you can do is say "everyone else does this, so I'll do it too". But this is not valid POSIX pthreads code, and the documentation says so. Everyone does this with Windows threads, so ... – David Schwartz Nov 06 '15 at 19:18
  • But we did agreed that 'access' itself is not defined on Posix either, didn't we? – SergeyA Nov 06 '15 at 19:29
  • @SergeyA I don't believe it's specifically defined, but it's understood to mean code that can read or write to the object. This lack of specificity has caused problems in the past. (For example, is it legal for a compiler to "optimize" `if (have_lock) j++;` to `j++; if (! have_lock) j--;`? This could be a real optimization, but if compilers are allowed to do it, it's really hard to write thread-safe code.) But I think we do the OP a disservice if we let his simple question lead down a rathole. – David Schwartz Nov 07 '15 at 03:36
  • as a matter of fact, this optimization is allowed as long as havelock or j are not atomic. Atomic lock or j will prevent it. I agree we are going deep in the dark. Btw, i spent some time over weekend reading link provided by Beyeler, and of course it explains exactly why what Beyeler is saying is not true. – SergeyA Nov 07 '15 at 22:12
  • @SergeyA If that were true, multithreaded code would be impossible to write safely. Imagine if you have three threads, each of which writes to a particular element of an array with something like `array[thread_id]++;`. If you were right, this could be changed to `array[0]++; if (thread_id !=0) { array[0]--; array[thread_id]++; }` which would make pretty much any threaded code that accessed indexed objects unsafe. And locks wouldn't even help you (because the threads would be grabbing different locks and the code would be entirely inside the scope of the lock). – David Schwartz Nov 09 '15 at 11:30
  • No, this is a different example. Code optimizations are not allowed to touch unrelated entities, otherwise untouched. Original code modifies j regardless of flow of execution. The code you've posted modifies array[0] only if thread_id is 0. This is different. Compiler is allowed the first optimization, and does not allow the second (pretty much because compiler is not allowed to randomly set values of variables which were never referenced in the functions and than revert back. – SergeyA Nov 09 '15 at 14:23
  • @SergeyA Except in the `have_lock` example, it's possible that `have_lock` is always false and thus `j` is never modified, just like it's possible thread `thread_id` is never zero and therefore `array[0]` is never modified. Unless you want to argue that somehow members of an array are "special", an argument for which there's no basis. – David Schwartz Nov 09 '15 at 17:03
  • No, members of array are not special. Compiler knows nothing about what is always what (unless dealing with constants, but this is completely different matter). it follows the code. When the code modifies variable under certain condition, compiler assumes that programmer took all neccessary precautions to make sure those modifications are correct ALWAYS. And optimizes. When the code as written NEVER touches an entity, compiler is not allowed to touch this entity. Your code is no different from `int i; int j; void foo() { i++;}` - bot i and j are global. Compiler can't touch j in foo(). – SergeyA Nov 09 '15 at 17:15
  • @SergeyA Say I do this: `void foo (int *j) { *j++; }`. May the compiler optimize this to `q++; if (j != &q) { *j++; q--; }`? Your argument implies yes. (Since the code *could* modify `q`, and therefore the compiler is free to modify `q`.) But if so, then writing multithreaded code is again impossible. – David Schwartz Nov 09 '15 at 19:00
  • No, David, why? My argument implies exactly that it can not. This code can only touch a memory pointed by j. The code as written does not touch q. So q can not appear in optimized code. In fact, no location other than pointed by j can be touched in the code. With the location pointed to j compiler is free to do anything it wants - multiply, divide, extract square root or randomize - as long as it can be proven that upon exiting the function result will always be the same as it would be if code was generated as written (or, if code as written produces undefined behavior with regards to j). – SergeyA Nov 09 '15 at 19:07
  • You said, "as a matter of fact, this optimization is allowed as long as havelock or j are not atomic". Now you're saying it's not possible? Yes or no, can: `if (have_lock) ++j;` be optimized to `++j; if (!have_lock) --j;`? – David Schwartz Nov 09 '15 at 19:20
  • I am not sure why I am not clear. Original f allowed for optimization, since it did touch j (absent atomic or volatile). This is exactly what I said above - if code touches entity, optimizer has a carte blanche. The latter code does not touch q, so it is not allowed to start touching it after optimization. So the answer is still the same. Yes, first optimization is allowed, no, second is not. – SergeyA Nov 09 '15 at 19:40
  • @SergeyA Somehow we're talking past each other. Is the optimization of `if (have_lock) j++;` to `j++; if (! have_lock) j--;` legal or not? You said "yes", but now it seems you're saying "no". (Since that optimization changes cases that don't modify `j` into cases the do modify `j`.) Is it your position that any compiler optimization that causes a write to be added to an object the unoptimized code would not write to is illegal? Do you realize how many optimizations that prohibits? – David Schwartz Nov 10 '15 at 00:07
  • Compiler optimizes on per-function basis. (Inlines are considered a part of caller function). The first function can touch j, so any optimizations to j are possible. Second function does not touch q, so no optimizations to q are allowed. Same answer. – SergeyA Nov 10 '15 at 00:45
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/94659/discussion-between-david-schwartz-and-sergeya). – David Schwartz Nov 10 '15 at 00:47
0

If your question is, calling invalidateOfType() simultaniously from different threads lead to data curruption (one thread reading the other writing to the same object), then the answer is yes.

But you can protect the resource, in this example the items vector, with a std::mutex and std::lock_guard like:

class itemPool {
public:
   static std::vector<Item*> items;
   static std::mutex         items_mutex;

   void invalidateOfType(UINT type) {
      std::lock_guard< std::mutex > scoped_lock(items_mutex);
      for (auto iter : items)
         if (iter->type = type)
         {
            iter->invalidate();
         }
   }
   ...
}

If thread1 is just executing invalidateOfType and thread2 does a call to invalidateOfType, then thread2 has to wait until thread1 has left the function and the items_mutex is unlocked.

Do this with every resource you share accross threads to prevent corruption.

Viktor Liehr
  • 508
  • 4
  • 14
  • Instead of following this advice you might as well just write all your programs in single thread. – SergeyA Nov 06 '15 at 18:24
  • I beg your pardon, who is talking about a single thread program? – Viktor Liehr Nov 06 '15 at 18:27
  • Victor, you are. I have said it multiple times, the art of multithreading is not in spraying your code with mutexes. It is with understanding your access and designing it in such a way that it can be avoided. – SergeyA Nov 06 '15 at 18:29
  • @SergeyA: What are you talking about?? Tell me where I'm about a singled threaded program and spraying mutexes ! Did you even read the question and my answer to it? – Viktor Liehr Nov 06 '15 at 18:33
  • I did. And your answer is incorrect. Op does not need a mutex here. – SergeyA Nov 06 '15 at 18:34
  • @SergeyA: So what now, did I or you? Find your stand. You think my answer is incorrect, then you can explain why you think that, can you? – Viktor Liehr Nov 06 '15 at 18:36
  • I just did. Your answer is incorrect. 'If your question is, calling invalidateOfType() simultaniously from different threads lead to data curruption then the answer is yes.'. The answer is NO in the code as posted. – SergeyA Nov 06 '15 at 18:38
  • @SergeyA: No you didn't you removed " (one thread reading the other writing to the same object)" from the quote! – Viktor Liehr Nov 06 '15 at 18:40
  • Because you reformated the question to suit your agenda. The question is clear, and OP clarified couple of times that no read is happening while writing. – SergeyA Nov 06 '15 at 18:41
  • @SergeyA: What did I reformat? Did you really read the whole question(s). I'll help you a bit with "In other words, can I use static resources from in multiple threads ( make parallel calls to that resource ) ?". – Viktor Liehr Nov 06 '15 at 18:43
  • And what? Yes, OP is making parallel calls. And yes, those calls are thread safe. I am not sure what you are confused about. – SergeyA Nov 06 '15 at 18:48
  • @SergeyA: You create the confusion talking about praying mutexes, removing text in my quote, and so on. So again, the question was "make parallel calls to that resource", the resource is 'static std::vector items;'. A call can read or modify the source and this is what I stated in the answer and showed a solution how to prevent it. So what now? – Viktor Liehr Nov 06 '15 at 18:55
  • Now that again you did not read the question and provided the solution for unexisting problem. I am going to repeat it one last time. Code provided (even if we make sure vector used is not local function variable) is thread-safe. It does not need mutexes. – SergeyA Nov 06 '15 at 19:06
  • @SergeyA: Absolutely vector is thread safe... in a certain way. Probably you should this this [link](http://stackoverflow.com/questions/9042571/is-stdvector-or-boostvector-thread-safe). And one more time for you buddy, "make parallel calls to that resource": A call can read and/or modify the source... And how can you be sure that there is only one thread writing and all others are reading? Your comment do not make much sense, or you are writing something different than you think, sorry... – Viktor Liehr Nov 06 '15 at 19:13
-1

Can I call the "invalidateOfType(UINT type)" - method from different threads?

Dear god, no! Simply touching that array from another thread while the invalidateOfType function is running is sufficient to crash your program instantly. There's no locks anywhere.

At the very least you should lock (ie mutex) access to the array itself.

Blindy
  • 65,249
  • 10
  • 91
  • 131
  • Why are you saying this???? Multiple threads can iterate the same vector as many times as they want. – SergeyA Nov 06 '15 at 18:18
  • But inserting an element in a vector that's iterated on, or removing one, that isn't thread-safe. Downvote all you want out of ignorance, it doesn't change anything. Feel free to educate yourself by reading the section 23.2.2 Container data races [container.requirements.dataraces] of the C++11 document. – Blindy Nov 06 '15 at 18:20
  • I figured that deleting, inserting, swapping will cause problems. But why is iterating and setting a bad idead? – Sascha Minor Nov 06 '15 at 18:23
  • @SaschaMinor, it will not. – SergeyA Nov 06 '15 at 18:25
  • Because you're iterating without checking a mutex. Someone somewhere is presumably filling that array, and even if you lock *that* down behind a mutex, this function isn't locked and will cause race conditions. – Blindy Nov 06 '15 at 18:25
  • @Blindy, only in your head someone is presumably doing something with this array. Not in the code posted, and not in the OP's program, as I can see. – SergeyA Nov 06 '15 at 18:27
  • Most people actually have elements in their arrays. Explains a lot that you think programs doing nothing are relevant. – Blindy Nov 06 '15 at 18:30