3

I have a function that is to be called by a thread initiated with a start and end parameter. The function works fine when run on single, main thread. However, when I try multi threading on it, the code breaks.

The function is as below:

static void processThread(long startLimit, long endLimit)
    {
        long rangeLimit = startLimit;
        while (startLimit < endLimit) {
            rangeLimit = rangeLimit + 100;
            startLimit++;
            Console.WriteLine("Processed for " + startLimit + ", " + rangeLimit); 
            startLimit = rangeLimit;
        }
    }

I am calling it from main as::

int threadCount = 4;
long[] startPoints = new long[threadCount];
long[] endPoints = new long[threadCount];
if ((endLimit / 100) % threadCount == 0)
{
    for (int i = 0; i < threadCount; i++)
    {
        endPoints[i] = endLimit * (i + 1) / threadCount;
    }
    startPoints[0] = 0;
    for (int i = 1; i < threadCount; i++)
    {
        startPoints[i] = endPoints[i - 1];
    }
}
Thread[] threads = new Thread[threadCount];
for (int i = 0; i < threadCount; i++)
{
    threads[i] = new Thread(() => processThread(startPoints[i], endPoints[i]));
    threads[i].Start();
    Console.WriteLine("Started for " + startPoints[i] + ", " + endPoints[i]); 
}

The expected result is something like

Processed for 1, 100
Processed for 101, 200
Processed for 201, 300
Processed for 301, 400
Processed for 401, 500
Processed for 501, 600
Processed for 601, 700
Processed for 701, 800
.....

And so on... But what I am getting is:

Started for 0, 2500
Started for 2500, 5000
Processed for 5001, 5100
Processed for 5101, 5200
Processed for 5201, 5300
Processed for 5301, 5400
Processed for 5401, 5500
Processed for 5501, 5600
Processed for 5601, 5700
Processed for 5001, 5100
Started for 5000, 7500
Processed for 5001, 5100
Processed for 5101, 5200
Processed for 5201, 5300
Processed for 5301, 5400
Processed for 5401, 5500
Processed for 5501, 5600
Processed for 5601, 5700
Processed for 5701, 5800
Processed for 5801, 5900
Processed for 5901, 6000
Processed for 6001, 6100
Processed for 6101, 6200
Started for 7500, 10000
Processed for 6201, 6300
Processed for 6301, 6400
Processed for 6401, 6500
Processed for 6501, 6600
Processed for 5701, 5800

Which has many repeated values and has none from the range 0-2500. I also tried with Task.Factory, and got the same result.

Any help on this would be greatly appreciated.

Pritish Kamath
  • 159
  • 1
  • 8
  • 1
    As an aside: if this is really the sort of work you want to parallelize, use `Parallel.ForEach` rather than writing your own partitioning code. It isn't easy to get right, so it's good that there's framework code. – Jeroen Mostert May 15 '17 at 12:47
  • What do you see in the lines with "Started for ..." ? – Ofir Winegarten May 15 '17 at 12:51
  • @OfirWinegarten edited to add the actual result to the question.. Thanks – Pritish Kamath May 15 '17 at 12:54
  • @JeroenMostert I saw that, but couldn't put together, how Parallel.ForEach would help... – Pritish Kamath May 15 '17 at 12:55
  • Or `Parallel.For`, or `Enumerable.Range(...).AsParallel()`. Depending on what, exactly, you're using the parallel work for (all you've shown here is the partitioning code, which does nothing useful in and of itself). – Jeroen Mostert May 15 '17 at 13:00
  • Are you joining the threads after starting them? Else your program might terminate before all threads are finished – Mats391 May 15 '17 at 13:05
  • @Mats391 I have another loop that Joins all the threads. The effect is the same. – Pritish Kamath May 15 '17 at 13:10
  • 1
    The problem is caused by "capturing the loop variable" – H H May 15 '17 at 13:31
  • 2
    Henk is correct. The fundamental problem here is that `i` is evaluated *when the thread runs*, and not *when the delegate is created*. It reads the *current* value of `i`, which is almost certainly higher than when the delegate was created. – Eric Lippert May 15 '17 at 13:39

2 Answers2

2

I ran your code and actually got an index out of bounds exception when starting the threads. This however worked for me:

Thread[] threads = new Thread[threadCount];
for (int i = 0; i < threadCount; i++) {
    var start = startPoints[i];
    var end = endPoints[i];
    threads[i] = new Thread( () => processThread( start, end ) );
    threads[i].Start();
    Console.WriteLine( "Started for " + startPoints[i] + ", " + endPoints[i] );
}

The issue might be that when the thread accesses startPoints[i] in your code, the i already changed in the main thread.

Mats391
  • 1,199
  • 7
  • 12
  • 2
    This is a correct answer. The lambda does not make a copy of `i` when the lambda is created. The lambda reads **the current value of `i`** when it is executed, but by then, `i` is much larger than it was when the delegate was created. – Eric Lippert May 15 '17 at 13:37
  • 1
    @Mats391, thanks a lot for your help. This answer worked as well. – Pritish Kamath May 15 '17 at 13:39
1

I would not comment on the correct solution for your problem, just to fix your output, you are missing this at the end in order to block main thread before all child ones are finished:

                Console.WriteLine("Started for " + startPoints[i] + ", " + endPoints[i]);
        }
        foreach (var thread in threads)
        {
            thread.Join();
        }

        Console.ReadKey();
    }

One more thing I have fixed in the code:

            int count = i;
            threads[i] = new Thread(() =>
            {
                processThread(startPoints[count], endPoints[count]);
            });

To prevent potential problems where i is invalid when evaluated from child threads.

I would also say that unless you really need to use "old" .net Threads, Parallel library or Tasks should be used for this as suggested by other commenters.

The same just rewritten with Task approach (I have kept it as is just replaced the Thread with Task for better illustration).

var factory = new TaskFactory();
        var tasks = new List<Task>();
        for (int i = 0; i < threadCount; i++)
        {
            int count = i;
            Task task = factory.StartNew(() =>
            {
                processThread(startPoints[count], endPoints[count]);
            });
            tasks.Add(task);
            Console.WriteLine("Started for " + startPoints[i] + ", " + endPoints[i]);
        }

        Task.WaitAll(tasks.ToArray());
Bobo
  • 335
  • 2
  • 10
  • Thanks for replying @Bobo, As I mentioned in my question, I have tried using Tasks, to no success, the Task.Factory attempt lies commented in my editor as it proved to be useless. – Pritish Kamath May 15 '17 at 13:20
  • Hey Bobo, your suggestion for declaring a variable for "count" worked very nicely. Thanks a lot for your help! – Pritish Kamath May 15 '17 at 13:38
  • I ran his original version and had the issue where i would be out of bounds. How is this race condition being created and why does your fix work? If you had a thread array where you created the threads in one for loop and the started them in another why doesn't this fix the race condition? How is i getting passed into this anonymous function? Where can I read more on how this works? – WithMetta May 15 '17 at 14:07
  • 1
    @WithMetta This briefly explains that: https://blogs.msdn.microsoft.com/matt/2008/03/01/understanding-variable-capturing-in-c/ The main point is that the lambda is evaluated when the child Thread/Task body is executed and since then the variable is already changed by the main thread for loop. – Bobo May 15 '17 at 14:25