
I have a multi-line textbox and I want to process each line using multiple threads.

The textbox could have a lot of lines (1000+), but far fewer threads. I want to use a custom number of threads to read all those 1000+ lines without any duplicates (that is, each thread reads UNIQUE lines only; if a line has already been read by another thread, it should not be read again).

What I have right now:

private void button5_Click(object sender, EventArgs e)
{
    for (int i = 0; i < threadCount; i++)
    {
        new Thread(new ThreadStart(threadJob)).Start();
    }
}

private void threadJob()
{
    for (int i = 0; i < txtSearchTerms.Lines.Length; i++)
    {
        lock (threadLock)
        {
            Console.WriteLine(txtSearchTerms.Lines[i]);
        }
    }
}

It does start the correct number of threads, but every thread loops over all the lines, so each line is read once per thread.

Salah Akbari
gafs

3 Answers


Separate data collection, data processing, and whatever you do with the results afterwards. You can safely collect results calculated in parallel by using ConcurrentBag<T>, which is simply a thread-safe collection (note that it does not preserve insertion order).
Then you don't need to worry about locking objects, and every line will be processed only once.
1. Collect the data
2. Process the collected data in parallel
3. Handle the calculated results

private string Process(string line)
{
    // Your logic for the given line; returning the input unchanged is only a placeholder
    return line;
}

private void Button_Click(object sender, EventArgs e)
{
    var results = new ConcurrentBag<string>();

    Parallel.ForEach(txtSearchTerms.Lines,
                     line =>
                     {
                         var result = Process(line);
                         results.Add(result);
                     });

    foreach (var result in results)
    {
        Console.WriteLine(result);
    }
}  

By default, Parallel.ForEach will use as many threads as the underlying scheduler provides.

You can limit the number of threads used by passing an instance of ParallelOptions to the Parallel.ForEach method; MaxDegreeOfParallelism sets the upper bound on concurrent operations.

var options = new ParallelOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount
};
var results = new ConcurrentBag<string>();
Parallel.ForEach(values,
                 options,
                 value =>
                 {
                     var result = Process(value);
                     results.Add(result);
                 });
Fabio

Consider using Parallel.ForEach to iterate over the Lines array. It is just like a normal foreach loop (i.e. each value will be processed only once), but the work is done in parallel - with multiple Tasks (threads).

var data = txtSearchTerms.Lines;
var threadCount = 4; // or whatever you want

Parallel.ForEach(data, 
    new ParallelOptions() { MaxDegreeOfParallelism = threadCount },
    (val) =>
    {
        //Your code here
        Console.WriteLine(val);
    });

The above code will need this line to be added at the top of your file:

using System.Threading.Tasks;

Alternatively, if you want to not just execute something but also return / project something, try PLINQ instead:

var results = data.AsParallel()
    .AsOrdered() // keep the results in input order
    .WithDegreeOfParallelism(threadCount)
    .Select(val =>
    {
        // Your code here, I just return the value but you could return whatever you want
        return val;
    }).ToList();

which still executes the code in parallel, but also returns a List (in this case with the same values as the original TextBox). And most importantly, because of AsOrdered(), the List will be in the same order as your input. This needs using System.Linq; at the top of your file.

mjwills

There are many ways to do what you want.

Take an extra class field:

private int _counter;

Use it instead of loop index. Increment it inside the lock:

private void threadJob()
{
    while (true)
    {
        lock (threadLock)
        {
            if (_counter >= txtSearchTerms.Lines.Length)
                return;
            Console.WriteLine(txtSearchTerms.Lines[_counter]);
            _counter++;
        }
    }
}

It works, but it is very inefficient: all of the processing happens inside the lock, so only one thread does work at a time.
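A possible middle ground, sketched here as an assumption rather than as part of the answer above: keep the shared counter, but claim each index atomically with Interlocked.Increment and do the actual work outside any lock. The class name SharedCounterSketch, the helper ProcessAll, and the ToUpperInvariant stand-in for real work are all hypothetical; lines is assumed to be a snapshot of txtSearchTerms.Lines taken once on the UI thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class SharedCounterSketch
{
    // Hypothetical helper: processes every line exactly once using threadCount threads.
    public static string[] ProcessAll(string[] lines, int threadCount)
    {
        var processed = new ConcurrentBag<string>();
        int next = -1; // index of the most recently claimed line

        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                while (true)
                {
                    // Atomically claim the next index; no two threads receive the same value.
                    int i = Interlocked.Increment(ref next);
                    if (i >= lines.Length)
                        return;

                    // The work itself (a placeholder here) runs outside any lock.
                    processed.Add(lines[i].ToUpperInvariant());
                }
            });
            threads[t].Start();
        }

        // Wait for all workers before returning the (unordered) results.
        foreach (var thread in threads)
            thread.Join();

        return processed.ToArray();
    }
}
```

Calling SharedCounterSketch.ProcessAll(txtSearchTerms.Lines, threadCount) from the click handler would evaluate Lines only once, on the UI thread, before any worker starts.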

Let's consider another way. Each thread will handle its own part of the dataset independently of the others.

public void button5_Click(object sender, EventArgs e)
{
    for (int i = 0; i < threadCount; i++)
    {
        new Thread(new ParameterizedThreadStart(threadJob)).Start(i);
    }
}

private void threadJob(object o)
{
    int threadNumber = (int)o;
    int count = txtSearchTerms.Lines.Length / threadCount;
    int start = threadNumber * count;
    int end = threadNumber != threadCount - 1 ? start + count : txtSearchTerms.Lines.Length;

    for (int i = start; i < end; i++)
    {
        Console.WriteLine(txtSearchTerms.Lines[i]);
    }
}

This is more efficient because the threads do not wait on a lock. However, the array elements are no longer processed in their overall order.
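To make the partitioning arithmetic concrete, here is the same start/end calculation pulled out into a standalone function. The names ChunkSketch and Range are hypothetical and exist only for illustration; the formula is the one used in threadJob above.

```csharp
public static class ChunkSketch
{
    // Returns the half-open index range [Start, End) handled by the given thread.
    public static (int Start, int End) Range(int total, int threadCount, int threadNumber)
    {
        int count = total / threadCount;   // base chunk size (integer division)
        int start = threadNumber * count;
        // The last thread also takes the remainder left over by the division.
        int end = threadNumber != threadCount - 1 ? start + count : total;
        return (start, end);
    }
}
```

For 10 lines and 3 threads, this gives (0, 3) for thread 0, (3, 6) for thread 1, and (6, 10) for thread 2, so the last thread picks up the one extra line.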

Alexander Petrov