1

I'm trying to transfer this prime sieving function to use Parallel.For so it can utilize multiple cores.

However, when I run it, the value of the b variable seems to randomly jump or even not change at all, especially for higher values of To.

static List<long> Sieve(long To)
{
    long f = 0;
    To /= 2;

    List<long> Primes = new List<long>();
    bool[] Trues = new bool[To];

    if (To > 0)
        Primes.Add(2);

    long b = 0;

    Parallel.For(1L, To, a =>
    {
        b++;

        if (Trues[b])
            return;

        f = 2 * b + 1;
        Primes.Add(f);

        for (long j = f + b; j < To; j += f)
            Trues[j] = true;
    });

    return Primes;
}

What's going on, and how can I stop that from happening?

  • You have multiple threads reading and modifying a variable without any locking, which will lead to the kinds of things you are seeing. – bmm6o Jan 31 '13 at 21:04
  • Inside a _Parallel.For_, you are in a multi-treaded context so the value b can change between _b++_ and _if(Trues[b])_ and between _if(Trues[b])_ and _f = 2 * b + 1_. And the _Primes.Add(f)_ can have unexpected behavior given that _List<>_ is not thread safe. – Cédric Bignon Jan 31 '13 at 21:05
  • For starters, your counter variable `b` is shared between the threads. If all you do is implement a locking mechanism on `b`, you may as well not use threading at all. You will have to restructure this logic. – JosephHirn Jan 31 '13 at 21:08
  • Basically, you're trying to use a sequential implementation, changed the `for` to a `Parallel.For`, and expected it to work. It's...not that easy. Writing an effective parallelized version of this algorithm is much less trivial, even using the fantastic TPL support of C#. To give you a basic idea of what you're up for, we spent about 3 weeks of our parallel programming class working on just this one algorithm (we went through various iterations of more and more efficient parallel algorithms, but still). – Servy Jan 31 '13 at 21:09
  • See [here](http://stackoverflow.com/a/2774237/238419) for more information on how to correctly use `Parallel.For` - you *don't* want to synchronize access to `b`, as suggested by the answers so far... – BlueRaja - Danny Pflughoeft Jan 31 '13 at 21:34

3 Answers3

1

b is shared across threads. What do you expect to happen if multiple threads bang on that poor variable at once?

It seems like b and a are always equal in your code (or differing by one). Use a. And synchronize access to all other shared state (like the list).

usr
  • 168,620
  • 35
  • 240
  • 369
1

Welcome to the wonderful world of multithreading.

Right off the bat, I can see that every iteration of your loop does a b++ and then uses b throughout its course. This means that every iteration of your loop will be modifying the value of b in the midst of all other iterations.

What you probably want to do is use the a variable made available in your inline function, which does exactly what you seem to be trying to do with b. On the off chance that this is not the case, then you should look into locking b and copying its value to a local (to each iteration) variable before doing stuff to it.

Try this instead and let me know if it's what you wanted to do:

static List<long> Sieve(long To)
{
    To /= 2;

    List<long> Primes = new List<long>();

    if (To > 0)
        Primes.Add(2);

    Parallel.For(1L, To, a =>
    {
        long f = 2 * a + 1;
        Primes.Add(f);
    });

    Primes.Sort();

    return Primes;
}
Theodoros Chatzigiannakis
  • 28,773
  • 8
  • 68
  • 104
  • When I used the a variable, it still has those random jumps, and even more frequently, which is why I switched to using the b variable. Are you saying that if I used a it shouldn't be jumping like that? –  Jan 31 '13 at 21:09
  • 1
    The fix isn't actually that easy. Using his current algorithm each iteration is dependent on all previous iterations having run; he needs a significant rework of the algorithm to remove that dependency. – Servy Jan 31 '13 at 21:17
  • @Ipquarx `a` will still have "jumps", but at least it will have a consistent value throughout an iteration, while `b` won't (and the latter is *definitely* wrong behavior). That being said, using `a` will probably allow you to remove the `if` and use the rest of the code pretty much as it is. Of course, `Primes` will be unsorted, but at least it'll have one value per iteration and you can sort it right after the loop. – Theodoros Chatzigiannakis Jan 31 '13 at 21:20
  • @Servy I'm a little tired so I may be missing something you're seeing, but I don't see the dependency. The only dependency (sort of) I see is in the `if` statement and it's probably there because of the use of `b` instead of `a` - and if he switches back to `a` it won't be required. – Theodoros Chatzigiannakis Jan 31 '13 at 21:22
  • @TheodorosChatzigiannakis The issue is with respect to access to `Trues`. Basically, each iteration looks at the "current position" to determine if it's been marked as true or false, and then based on that value computes whether or not a bunch of higher indexes are true or not and sets their value. If those indexes are processed in parallel (or even sequentially, but not in the proper order) then when checking the "current" value previous iterations had no chance to "set" it. – Servy Jan 31 '13 at 21:24
  • @Servy True, but the way I understand it, he's only using `Trues` to counter the fact that the iteration order is non-deterministic. If you completely ignore that and just sort `Primes` after the loops, it might not be needed. See my edit. (Note: I know I may be *dead* wrong guessing OP's intentions, of course, so in the end you may be right and I apologize in advance if that's the case.) – Theodoros Chatzigiannakis Jan 31 '13 at 21:29
  • 1
    @TheodorosChatzigiannakis Looking at your proposed algorithm it 1) doesn't actually generate a list of all prime numbers from 0 to `To` and 2) runs in O(N) time, when it should be running in much less than that. -- Some of this is based on errors in the OP's sequential version, some based on your improper conversion to a parallelized version. You might want to just [look up the correct algorithm](http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). – Servy Jan 31 '13 at 21:29
  • @TheodorosChatzigiannakis You're basically saying, "skrew the algorithm, I'll just brute force the whole thing since I'm in parallel) by removing what you have. The entire *point* of the algorithm is that you don't do anything for the vast majority of values. – Servy Jan 31 '13 at 21:31
  • @Servy Yes, I now see what you're saying (and I *was* dead wrong, what a surprise). I only focused on the variable issue and disregarded the point of the algorithm. You are absolutely correct, thanks. – Theodoros Chatzigiannakis Jan 31 '13 at 21:34
1

The problem you are facing here is called race conditions, it's what happens when multiple CPU cores load the same variable into their respective cache, work on it, then write the value back to RAM. Obviously, the value that's written back to RAM may already be old in the meantime (like when a core loads the variable right before it's overwritten with another value)

First of all: I wouldn't use b++ but int i = Interlocked.Increment(ref b); instead. Interlocked.Increment ensures that no 2 threads attempt to increment the same value at the same time. The result is the incremented value which will be saved into the variable i. This is very important, because you will need that value to remain constant for every iteration of your for-loop, which would be impossible otherwise, since other threads will be incrementing this variable.

Next is your variable f and a (defined as the For-iterator). Forget f, use a instead.

f = 2 * b + 1; // wrong
a = 2 * b + 1; // correct

Lastly: System.Collections.Generic.List is NOT, I repeat (because it's important) NOT thread safe. See http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx for more details.

Primes.Add(f); // will likely break something
lock (Primes)  // LOCK the list
{
    Primes.Add(a); // don't forget, we're using 'a' instead of 'f' now
}

The lock keyword accepts only reference-type variables as argument, that is because locking a variable does NOT prevent another thread from accessing it. Instead, you can imagine it as setting a flag on top of the reference, in order to signal other threads I'm working here, please do not disturb!

Of course, if another thread attempts to access Primes without asking to lock it beforehand, the thread will still be able to access the variable.

You should've learned all of this though, since the Parallel Prime Sieve is one of the most common beginner exercises when first learning about multithreading.

EDIT:

After all the above steps are done, the program shouldn't run amok; however this does not mean that the solution will be correct, or that you'll have gained a speedup, since many of your threads will be doing duplicate work.

Assume Thread a is given the responsibility to mark every multiple of 3, while Thread n is given the responsibility to mark the multiples of 9. When run sequentially, by the time Thread n begins processing the multiples of 9, it will see that 9 is already a multiple of another (prime) number. However, since your code is now parallel, there is no guarantee that 9 will be marked by the time Thread n begins its work. Not to mention that - since 9 may not be marked - might be added to the list of prime numbers.

Because of this, you have to sequentially find all prime numbers between 1 and the square root of To. Why the square root of To? That's something you'll have to find out yourself.

Once you have found all prime numbers from 1 to the square root of To, you can start your parallel prime sieve in order to find the rest of the primes, using all primes found previously.

One last noteworthy point would be, that Primes should be built only after Trues has been filled. That's because:

1. Your threads will only have to process the multiples of 2 to the square root of To, thus will in the current implementation not add any more elements to Primes beyond the root.

2. If you choose to have your threads go beyond the root, you'll face the problem, that one of your threads will add a non-prime number to Primes shortly before another thread marks that number as non-prime, which is not what you want.

3. Even in the event that you were lucky and all elements of Primes are indeed all prime numbers between 1 and To, they may not necessarily be in order, requiring Primes to be sorted first.

Nolonar
  • 5,962
  • 3
  • 36
  • 55
  • 1
    None of the points here address the fundamental algorithmic problem that each iteration is dependent on the previous iteration having executed. The basic algorithm needs to change to remove that dependency. Using a thread save structure and synchronizing access to `b` won't fix all of that. – Servy Jan 31 '13 at 22:05
  • @Servy You are right; but the question was "why do my variables misbehave and how do I stop this", not "why does my algorithm not work". The other problem involves sequentially solving the sieve for sqrt(To) and use the result of that as a base for all other threads to work on. Should I edit my answer to accommodate this part as well? – Nolonar Jan 31 '13 at 23:12