-4

I am having a problem with my Lock() critical section of the code. The situation is that my output is containing duplicates, I can't have that! Below you can see the code that replicates my situation. I am really dumb founded by my situation, excuse me if I am missing something really silly...

private static object theLock = new object();
private static int currentNumber = 0;

. . .

static void Main(string[] args)
{
    for (int i = 0; i < 30; i++)
        CreateAndRunWorker();
}

. . .

private static void CreateAndRunWorker()
{
    BackgroundWorker worker = new BackgroundWorker();
    worker.DoWork += new DoWorkEventHandler(TheWorkToBeDone);
    worker.RunWorkerCompleted += new RunWorkerCompletedEventHandler(TheWorkAfterWork);
    worker.RunWorkerAsync();
}

. . .

private static void TheWorkToBeDone(object sender, DoWorkEventArgs e)
{
    OutputNum();
}

. . .

private static void TheWorkAfterWork(object sender, RunWorkerCompletedEventArgs e)
{
     CreateAndRunWorker();
}

. . .

private static void OutputNum()
{
    lock (theLock)
    {
        currentNumber++;
        Console.WriteLine(currentNumber);
    }
}

The above is being called by 25 background workers from a Main() function. The lock and counter are initialized globally. The output contains duplicate numbers. How?

Rob Steiner
  • 125
  • 1
  • 10
  • 2
    Show us more code. Is it in winforms? How and where you call `OutputNum`? – Hamid Pourjam Dec 18 '15 at 17:11
  • 2
    I'd say you're probably re-setting "theLock" somewhere. Anyways, the code you've shown us doesn't contain the problem. – Paul Groke Dec 18 '15 at 17:12
  • I'll add more code, one sec. It is a console application. – Rob Steiner Dec 18 '15 at 17:12
  • @PaulGroke I only initialize the lock once as a global. Never touch it till Lock(theLock). – Rob Steiner Dec 18 '15 at 17:19
  • You need to provide a *complete* example that is able to reproduce the problem. You still haven't provided a complete example. – Servy Dec 18 '15 at 17:20
  • There's some strange logic in `TheWorkAfterWork` method or maybe I'm not seeing right. – bokibeg Dec 18 '15 at 17:21
  • @Servy for sake of simplicity, I am removing functions that the BackgroundWorked never touches. The above code is everything that Background worker performs. – Rob Steiner Dec 18 '15 at 17:23
  • Ok I looked at it again and there's NO way you can have "duplicates" because 1st you'd get an exception instead and 2nd it would even work without a lock due to the logic of your code. – bokibeg Dec 18 '15 at 17:24
  • @bokibeg Once a worker stops working, I create another worker to replace it. – Rob Steiner Dec 18 '15 at 17:24
  • 1
    @bokibeg Well, the code will eventually overflow the `int` and start repeating values again. It's unclear if that's what hes observing or not, due to the incomplete information given. – Servy Dec 18 '15 at 17:25
  • Not reproducing within a reasonable subset, a couple million iterations. I'll give a +1 to overflow. Also, @bokibeg, a lock is definitely required or else multiple threads may be modifying `currentNumber` – Glorin Oakenfoot Dec 18 '15 at 17:26
  • @Servy It repeated only once out of 8,000 numbers. It reapeated at 1016. – Rob Steiner Dec 18 '15 at 17:26
  • @RobertSteinerIII Like I said, you need to create a *complete* program that can reproduce your described problem. It should also be a minimal reproducible example and simply not include any code superfluous to replicating the problem. – Servy Dec 18 '15 at 17:26
  • @Servy Good point... Also I've yet to hear about someone waiting through that many `Console.WriteLine`'s :) ([MCVE] or never happened) – Alexei Levenkov Dec 18 '15 at 17:27
  • Oh a side not, this isn't an appropriate use of BGW. It exists explicitly to do work in a UI context where you need to report progress and/or completion to the UI. As you're not in a UI context, it's not the appropriate tool for the job. – Servy Dec 18 '15 at 17:29
  • The only job a worker has is to output to the console. Isn't that UI work? Either way, thread vs bgw isn't that big of a difference. – Rob Steiner Dec 18 '15 at 17:30
  • @RobertSteinerIII But your BGW isn't handling your UI interaction, the method it's calling, that is using none of the BGW features, is what's interacting with your UI. Like I said, the issue here is simply using the wrong tool for the job. There's also the fact that you're having each thread create an entirely new thread after doing a *tiny* amount of work, which is *enormously* wasteful, rather than just having each thread do the work over and over again in a loop. – Servy Dec 18 '15 at 17:32
  • @Servy Thank you for your input. I am not concerned about how wasterful this application might look to you, but I am concerned why I am having repetition in my output. The BGW is calling the OutputNum() function, how is it not handling my UI interaction? – Rob Steiner Dec 18 '15 at 17:36
  • 2
    The main thing about this question is that code being given **works fine**. – bokibeg Dec 18 '15 at 17:38
  • @RobertSteinerIII I specifically said that those points aren't related to your stated problem, they're just causing unrelated problems, and are both practices that you shouldn't actually use in any real code. – Servy Dec 18 '15 at 17:47
  • @RobertSteinerIII please add `readonly` just before `object theLock` just to make sure you're not missing something. (I always make my "lock objects" `readonly` - along with everything else that I can. Makes it harder to make stupid mistakes.) – Paul Groke Dec 19 '15 at 07:02

1 Answers1

-1

You are probably running into some cache effects for local caches on unrelated cores. The issue at hand is the int itself isn't declared or updated in a way that would ensure the local cache is kept up to date. You could address this in a couple ways:

Fixing your implementation

private static readonly object theLock = new object();
private static volatile int currentNumber = 0; // note the addition of "volatile"

private static void OutputNum()
{
    lock (theLock)
    {
        currentNumber++;
        Console.WriteLine(currentNumber);
    }
}

The volatile keyword is important to force all locally cached references to that number to read the same. There are corner cases in the micro-optimizations where executing things out of order can cause a race condition on certain chip architectures. The lock is still required to protect from increment race conditions though.

The readonly keyword is important to prevent your lock object from ever getting overwritten during run time. Once initialized the instance of the readonly lock object can never be changed. If you do change the instance of the lock object, you will have a short period of time where two threads can access the counter variable at the same time.

Getting rid of the lock

If you use the Interlocked.Increment function you can get the effect of what you want without using locks at all.

private static volatile int currentNumber = 0;

private static void OutputNum()
{
    int localRef = Interlocked.Increment(ref currentNumber);
    Console.WriteLine(localRef);
}

This removes the lock completely, keeps the solution more performant since there is less lock contention, but retains all the safety.

IMPORTANT: Use the value returned from Interlocked.Increment for future work in the same method since that won't change from another thread accessing the same code and changing the value before the Console.WriteLine() can execute.

Understanding Why This Works

  1. The CLR performs optimizations and recompiles code at runtime. The volatile keyword ensures the optimizations don't include accessing your counter. (https://msdn.microsoft.com/en-us/library/x13ttww7.aspx)
  2. If the lock instance ever changes during runtime, you will have a brief moment where you can have a race condition. One thread locked on the old object while another is locked on the new object. Always declare your lock objects so that they cannot change. If it's a static object, then static readonly or const is what you want. If it's a field inside an object meant to protect one instance, declare the lock as readonly. The readonly keyword ensures the instance can never change for the life of the containing scope. (https://msdn.microsoft.com/en-us/library/acdd6hb7.aspx)
  3. Interlocked.Increment performs an atomic change that is guaranteed safe across all running cores. If your critical section of code is simply incrementing a counter, this method relieves the system of needing a heavy lock. Just use the return value for the rest of the method so that it stays constant for that scope. (https://msdn.microsoft.com/en-us/library/dd78zt0c(v=vs.110).aspx)

For a good summary of the subject, see https://stackoverflow.com/a/154803/476048

For academic purposes the same problems and solutions exist in Java. The APIs are just a little different.

Community
  • 1
  • 1
Berin Loritsch
  • 11,400
  • 4
  • 30
  • 57
  • 2
    The `lock` statement is *already* adding memory barriers specifically to prevent this. Making the variable volatile isn't changing anything. Removing the lock is only *decreasing* the safety of the code, not improving it. – Servy Dec 18 '15 at 17:30
  • Thank you for your help. Unfortunately, switching to volatile did not fix the issue. – Rob Steiner Dec 18 '15 at 17:38
  • @savvy, read the article I attached at the bottom. Also do you have a good understanding of how local caches in multicore CPUs work? It's clear by your comments you have a very incomplete understanding of what mechanisms C# provides. And Interlocked.Increment is completely threadsafe. Try it. – Berin Loritsch Dec 18 '15 at 17:43
  • @RobertSteinerIII, is your lock object declared as a constant? There is a possibility that the object can be periodically replaced the way you have it instantiated in the code you included. – Berin Loritsch Dec 18 '15 at 17:46
  • @BerinLoritsch Is there any documentation on constant vs static that could be related to this example? I believe this worked and I'm not sure why. I have produced 12k numbers and no repeats. Usually repeated by now... – Rob Steiner Dec 18 '15 at 17:55
  • @BerinLoritsch The documentation for `lock` explicitly states that when crossing the borders of the critical sections it will synchronize the caches, so it is the responsibility of `lock` to ensure that the CPU caches *don't* get out of sync with the memory of the program in this context. The `Interlocked` statement would likely be safe here, bu the point is that using it is *removing many of the safety mechanisms that `lock` is putting in place*. It's making the program *easier* to break in the future, and adding *no* additional safety mechanisms, so it is in no way an improvement. – Servy Dec 18 '15 at 17:58
  • Either way, I'll take this as the answer. It was either the Interlocked code or the const code that fixed it. Both great suggestions and great follow-up post provided. – Rob Steiner Dec 18 '15 at 18:06
  • @RobertSteinerIII Neither suggestion is a meaningful change of the code that you posted. So either your problem wasn't with code posted, you never had a problem in the first place, or this didn't fix it. – Servy Dec 18 '15 at 18:09
  • @Servy Thank you for your concerns. Luckily, I applied his recommended changes and no longer have repeating numbers. I'll credit him for it. – Rob Steiner Dec 18 '15 at 18:12
  • @RobertSteinerIII `new object()` isn't a compile time literal, so **the code posted here doesn't even compile**. Given that the code doesn't even compile, I'd say that it's rather unlikely that using it *when it completely breaks your code* would solve your problem. Sounds like you didn't even apply the changes at all. – Servy Dec 18 '15 at 18:23
  • @Servy Advert your attention to the following screenshot, http://imgur.com/a99KOLB . Not only did I request to edit his post, I did in-fact apply his changes. I'm sorry if this confuses you. – Rob Steiner Dec 18 '15 at 18:27
  • @RobertSteinerIII You should *never* be locking on a string literal. The whole point of creating an object to lock on is to create an object *that is not accessible from any other scope*, else you are at great risk of causing a deadlock. And based on your last comment, it seems that, contrary to your earlier statement, *his proposed changes didn't work*. You just created something entirely separate from his proposed changes (with significant new problems). Of course I'm confused why you're actively trying to encourage people to use completely broken code. It's also rather depressing. – Servy Dec 18 '15 at 18:31
  • @Servy Again, my original concern was eradicating repetition. I changed a variable type, that isn't nearly as radical as you're making it out to be. A 30 line program that I've now run a few times has yet to deadlock. It is under control now and I thank him for it. – Rob Steiner Dec 18 '15 at 18:42
  • @RobertSteinerIII Changing the type of the variable isn't the problem, it's assigning a value to it that's in the intern pool, and is thus accessible *throughout the entire program*. This is highly problematic because, while it may work in a toy program that has literally nothing else going on, it will have all sorts of negative consequences if used in a *real* program that has other synchronization code in it, because unlike what would happen if this was written properly, they would be impacting each other. And why you're thanking him for code *he didn't even provide* makes no sense either. – Servy Dec 18 '15 at 18:45
  • @RobertSteinerIII The fact that you don't realize what you'r proposing is harmful is worrying, the fact that you don't *care* that you *do* know that what you're doing is harmful is rather depressing. The fact that you're actively trying to encourage other people to use solutions that you know are harmful is just appalling. That's just cruel. – Servy Dec 18 '15 at 18:48
  • @Servy I understand your concern, but my toy program works now. My solution has been found. His response contains code, he provided insight that lead to my solution. I can at least say he provided more than you. To each their own. – Rob Steiner Dec 18 '15 at 18:51
  • @RobertSteinerIII So if someone told you that they were having trouble getting their pistol out of their holster, so someone told them that they'd be better off keeping their gun tucked into their pants, loaded, with the safety off, and that they hadn't yet had a problem, You'd be glad to see someone propagating such advice? You'd consider anyone saying that that's extremely unsafe and is very likely to cause serious problems someone that's being unhelpful, because you personally hadn't see any evidence of such a problem? – Servy Dec 18 '15 at 18:58
  • @RobertSteinerIII And for the record, your solution *hasn't* been solved. The provided answer will in no way have fixed the problem you claim to have. So either you had no problem to fix, you still have the same problem, or something *else* that you changed that wasn't suggested in this answer fixed the bug you had in code not shown. You have no idea which, and we have no way of possibly knowing as you haven't provided a complete reproducible example of your code. That you're opposed to even figuring out what the problem was, or how to actually fix it, does not bode well for you. – Servy Dec 18 '15 at 19:01
  • @Servy Like I said, I understand your concern. My toy application (toy gun) is now running with no errors. If was to be working on serious application (real gun), I'd take your advice and search for another solution. We can talk about it later, I'll be back on tonight. – Rob Steiner Dec 18 '15 at 19:03
  • 1
    @RobertSteinerIII So you don't care about getting a working solution at all, because you don't have a real problem in the first place? Then why even ask the question, if you don't care about getting an actual working solution? Why encourage someone for giving you a solution that doesn't work when you don't care if your solution works or not? Why try to get other people to use a solution that doesn't work just because you don't care if you have a working solution? You're basically saying that you're being a troll, because you can. – Servy Dec 18 '15 at 19:06
  • @Servy Yo bruh, I don't know how many times I gotta say it lol But, I applied his changes (with a minor edit) and get no repetition now. My solution, my question, my thanks to give. You gotta relax... – Rob Steiner Dec 18 '15 at 19:08
  • @RobertSteinerIII So you just like being mean to other people for kicks? And you think that other people should just be okay with that, and shouldn't find that concerning? – Servy Dec 18 '15 at 19:17
  • @Servy Back! Lets continue. Who am I being mean to? Possibly you, but I have to defend why I credit this man from your barrage of discredit. Are you taking it personally? I've spoken to no one else, but you... I had a problem,I asked a question, and now it is fixed. A working solution. I can't aruge your own contradiction in that paragraph before I told you to relax, but please understand this solution works for me and I have not forced it on anyone else. – Rob Steiner Dec 18 '15 at 20:43
  • @RobertSteinerIII You're being mean to every single reader of the question who ends up being directed to a completely incorrect answer, while being told that it's a correct answer, by you. I'm not taking it as a personal insult to me that you're intentionally trying to make the world a worse place. I find it depressing and wrong, but I don't think you're doing it to spite me. You do *not* have a working solution. That you can't replicate your problem in no way means this fixed it. The *premise* of your question is that you have a race condition, which, by its very nature, is transient. – Servy Dec 18 '15 at 20:47
  • @RobertSteinerIII And of course the fact that *this solution doesn't even compile* makes it quite clear that it *doesn't* fix the problem. Your completely unrelated change both would result in the change you describe, and introduces *significant* problems into the solution, so it's both not a fix and is actively harmful. You of course know this, and simply don't care. If you don't care about using broken code fine, I'm well past caring if you have a working solution, but encouraging other people to use a broken solution is not something I can ignore. – Servy Dec 18 '15 at 20:50
  • @Servy If it's going to keep you up at night that this solution is fine and works for me, but not for everyone else on earth, then what do you propose? – Rob Steiner Dec 18 '15 at 20:52
  • @RobertSteinerIII The solution isn't fine, and it *doesn't* work for you. You've just said that you don't care that it doesn't work for you. Big difference. If you want to *actually* solve your problem, then you need to start by figuring out what your actual problem is, and to do that, you need to provide a complete reproducible example in the question. If you don't care about getting a solution, then at a minimum, you shouldn't mark an incorrect answer as being correct just to troll other people. – Servy Dec 18 '15 at 20:55
  • @Servy Let me repeat myself, just one more time. I took in the changes above, and it worked. Zero repetition in my output. Thus, it is indeed working for me lol I believe the correct action after taking advice that works is to thank them. The code I provided hasn't changed, except for inserting his changes. Sloppy code? Sure, I understand why you say that. In my 30 line solution, that sloppy code doesn't mess anything up. – Rob Steiner Dec 18 '15 at 21:04
  • @RobertSteinerIII That doesn't mean that your code works, or that your problem has been solved. Again, the whole premise of your question is that you have a race condition. That you ran the program once and didn't see it in no way demonstrates that it's not there. And the fact that you don't care that your changes are going to cause significant problems, or that you don't know enough about what you're doing to even realize what's going on, doesn't mean that they won't mess anything up. – Servy Dec 18 '15 at 21:15
  • @Servy I've now run the program countless times with no repetition. I'd say it is pretty solid, from experience. Code most likely works, if I am getting my desired results. Changes wont cause significant problems, like you've said, it is a toy solution and it's not growing past 30 lines. Problems are contained, everything is working. Much thanks. – Rob Steiner Dec 18 '15 at 21:20
  • @RobertSteinerIII If this is the *only* change that you made, and the code actually didn't work before, then it *unquestionably* doesn't work, because there's no possible way for any of these changes to affect the results in a way that you've described. The changes are *extremely* likely to cause *significant* problems, not only for you, but anyone else that would be unfortunate enough to find this post and take its advice. Saying that you've contained the problem and that it works when it *certainly* doesn't is *far* worse than not fixing the problem and at least knowing you didn't fix it. – Servy Dec 18 '15 at 21:26
  • @Servy We'll just have to agree to disagree.. – Rob Steiner Dec 18 '15 at 21:29
  • @RobertSteinerIII You can disagree over a matter of opinion. It is a matter of *fact* that this answer is wrong, and wouldn't change the behavior described. I guess you could argue that it's your opinion that it's okay for the answer to be completely wrong, not fix the problem, and still be good, but I wouldn't consider that an opinion I'd be willing to reasonable disagree with someone over. – Servy Dec 18 '15 at 21:33
  • @Servy We can Skype Screenshare. I'll show you the facts. – Rob Steiner Dec 18 '15 at 21:38
  • @RobertSteinerIII You don't need to share your screen, you just need to post a complete reproducible code example in the question. Why you've continually refused to do so, thus preventing anyone from actually determining what the answer is, is beyond me. And note that an anecdote is not the same as a fact. If your code has a race condition, (which is what your question is claiming) then it doesn't matter how many times you run the program and it works, the race condition can still be there. – Servy Dec 18 '15 at 21:45
  • This should have been taken to chat guys. That said, if the instance of the object lock object ever changes, you will have a brief moment in time when race conditions can occur. Either `static readonly` or `const` will make sure the reference never changes. NOTE: the `volatile` declaration prevents the CLR from doing some optimizations that can affect the code when run across threads (see https://msdn.microsoft.com/en-us/library/x13ttww7.aspx). – Berin Loritsch Dec 18 '15 at 23:20
  • `Interlocked.Increment` performs an atomic operation, where if the increment is the extent of what you want locked is a logical way to do things in a thread safe manner. See https://msdn.microsoft.com/en-us/library/dd78zt0c(v=vs.110).aspx – Berin Loritsch Dec 18 '15 at 23:22
  • @Servy, it's a matter of fact that the OP's problem is no longer appearing after applying my suggested solution. I've been doing multithreaded programming for a couple decades, and have learned a thing or two. I've even added the proof of why the solution works. Now, it's on you to either provide a better solution and your own competing answer or let it go. Nothing I put forth here is ground breaking and it is _well_ understood by folks who've done quite a bit of multithreaded work. – Berin Loritsch Dec 18 '15 at 23:45
  • @BerinLoritsch The `lock` statement *also applies those same guarantees*, so that was *never* a problem to begin with. Using `volatile` or `Interlocked` *is not fixing a problem of invalid optimizations making the operation not atomic. – Servy Dec 18 '15 at 23:54
  • @BerinLoritsch `it's a matter of fact that the OP's problem is no longer appearing` But it's *not* a fact that this fixed anything, and it's *not* a fact that the race condition doesn't exist. **The OP doesn't have a bug that you fixed**. If you'd posted an answer that just changed the variable names or removed some superfluous whitespace the code would still work because *all of the code that the OP actually showed works just fine*. You just introduced a compiler error and fixed nothing that was broken. Either the OP never had a problem, or it was in code not shown that you didn't fix. – Servy Dec 18 '15 at 23:59
  • @Servy, it's your _opinion_ that my answer is wrong, and until you can prove otherwise, I highly recommend you formulate your own answer that does address the issue that the OP has. And recheck the current answer. And the changes were much more substantial than you are suggesting. – Berin Loritsch Dec 19 '15 at 00:04
  • @BerinLoritsch It's not an opinion. You posted code that doesn't even compile. That's a *fact*. You also claimed that the OP's code can be optimized by the compiler to be not logically atomic *which is completely incorrect*, as the C# specs specifically state the opposite with respect to the `lock` statement in this context. The OP hasn't posted a complete example, so there is no possible way to determine why their code doesn't work as **they haven't shown it**. The question **cannot be answered by anyone** until it provides enough information to actually diagnose the problem. – Servy Dec 19 '15 at 00:08
  • `And the changes were much more substantial than you are suggesting.` They're not *functionally* changing anything. Either both code snippets will work, or neither will. There's no way for one of them to have a bug that the other doesn't. You make like this style better, and that's fine, but *it's not fixing the problem* described in the quesiton. – Servy Dec 19 '15 at 00:13
  • @Servy, how many cores did you throw at the test code to see the problem? Are your running AMD, Intel, or some other CPU? When reproducing race conditions and deadlocks, these things matter. Sometimes it takes the right architecture to experience the bug. Something tells me you aren't testing like you think you are. And I'll stop responding to you in these comments. I get the feeling I'm feeding a troll, but I'll give you the benefit of the doubt. – Berin Loritsch Dec 19 '15 at 03:34
  • 1
    @Berin Loritsch Your explanation isn't correct. lock provides the necessary barriers/fences. If it wouldn't, the language would be utterly broken. – Paul Groke Dec 19 '15 at 06:55
  • @BerinLoritsch So you're saying the entire `lock` statement is completely broken, and just doesn't work at all? Care to back up that statement in any way? Or are you just saying that multithreaded code is so hard to understand you have no idea why you think this work,s but you think it just happens to for some unknown reason and that's just good enough? Again, **where is the bug in the OP's code**? You haven't shown *any* bug in the code, nor have you fixed anything, you've just made purely aesthetic changes and made completely invalid claims that it'll help. – Servy Dec 19 '15 at 16:50
  • @PaulGroke, I'm saying that there are some corner cases where the runtime optimizations breaks some of the expectations on stuff in the lock. If that were not the case, the `volatile` keyword would be utterly unnecessary. This mostly comes to play when optimizing something that can live entirely in the core's L1/L2 cache. If the core's L2 cache is shared (AMD architecture) you are less likely to see that corner case than when the L2 cache (Intel architecture) is not shared across cores. – Berin Loritsch Dec 19 '15 at 18:42
  • The only thing I'm unsure of is if the OP's code changed the lock object instance at runtime through code we didn't see listed. That's the primary reason to have lock objects as `readonly` to prevent that type of mistake by making the compiler catch the mistake. – Berin Loritsch Dec 19 '15 at 18:43
  • @Berin Loritsch Yes, I understand what you're saying. I'm just telling you that you're wrong. – Paul Groke Dec 19 '15 at 19:31
  • If I'm wrong, why is it that the OP's code is now working? Please post your more correct answer. – Berin Loritsch Dec 19 '15 at 19:35
  • @Berin Loritsch The code that the OP posted *does* work. The problem is something he's not showing us. He might have removed the bug by simplifying & shortening the code for posting here. Happens all the time. – Paul Groke Dec 20 '15 at 00:06