
I'm working with a class that contains complex properties. Each of these properties is computed by a different method. I'm using Parallel.Invoke to update different properties of the same object. Will this cause any issues for the object?

// sample class definition. I've simplified the example by using 'object' type
// for complex types. 
public class TestResult
{
     public object Property1;

     public object Property2;

     public object Property3;
}

// Here we populate an object. We process it in parallel because each method
// takes a considerable amount of time.
var testResult = new TestResult();
Parallel.Invoke(
() =>
{
       testResult.Property1 = GetProperty1Value();
},
() =>
{
       testResult.Property2 = GetProperty2Value();
},
() =>
{
       testResult.Property3 = GetProperty3Value();
});

Will the above code cause ANY issues for the testResult object?

Note: I've tested this part of the code, and it doesn't seem to cause any issues. As far as I know, since different properties are assigned in different tasks, this shouldn't be an issue. I couldn't find any documentation on this, so I'm asking this question to confirm the behavior.

SRIDHARAN
  • Somewhat related: [How to use Task.WhenAll to run 2 calculations at once](https://stackoverflow.com/questions/68924697/how-to-use-task-whenall-to-run-2-calculations-at-once) – Theodor Zoulias Sep 11 '21 at 18:06
  • The answer is `yes, lots of issues`. This exact snippet won't have threading issues because it's trivial. Nothing prevents any of the delegates from trying to modify multiple properties, or from trying to use property values that were modified by another delegate. – Panagiotis Kanavos Sep 14 '21 at 13:51
  • Could you please explain what the issues are? – SRIDHARAN Sep 17 '21 at 06:12

1 Answer


First and foremost it should be mentioned that the Property1, Property2 and Property3 in your example are technically called fields, not properties.

Your example is perfectly safe regarding the integrity of the TestResult instance, after the Parallel.Invoke operation has successfully completed. All of its fields will be initialized, and their values will be visible to the current thread (but not necessarily visible to other threads that were already running before the completion of the Parallel.Invoke).

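On the other hand, if the Parallel.Invoke fails (it throws an AggregateException when one or more of the actions throw), then the TestResult instance may end up being partially initialized: the fields whose actions failed are left unassigned.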
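The partial-initialization scenario can be reproduced with a small sketch. The string values and the simulated exception are hypothetical stand-ins for the real GetPropertyNValue methods, and the class name PartialInitializationDemo is made up for illustration:

```csharp
using System;
using System.Threading.Tasks;

public class TestResult
{
    public object Property1;
    public object Property2;
    public object Property3;
}

public static class PartialInitializationDemo
{
    // Returns the populated instance and the number of actions that failed.
    public static (TestResult Result, int Failures) Run()
    {
        var testResult = new TestResult();
        int failures = 0;
        try
        {
            Parallel.Invoke(
                () => { testResult.Property1 = "value 1"; },
                // Hypothetical failure standing in for GetProperty2Value().
                () => { throw new InvalidOperationException("Simulated failure"); },
                () => { testResult.Property3 = "value 3"; });
        }
        catch (AggregateException ex)
        {
            // Parallel.Invoke wraps the exceptions of the failed actions.
            failures = ex.InnerExceptions.Count;
        }
        return (testResult, failures);
    }

    public static void Main()
    {
        var (result, failures) = Run();
        Console.WriteLine($"Failed actions: {failures}");
        Console.WriteLine($"Property2 assigned: {result.Property2 != null}");
    }
}
```

Here Property2 stays null because its action threw before the assignment, so code that catches the AggregateException and keeps using the instance has to be prepared for unassigned fields.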

If the Property1, Property2 and Property3 were actually properties, then the thread-safety of your code would depend on the code running behind the set accessors of those properties. In case this code was trivial, like set { _property1 = value; }, then again your code would be safe.
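To illustrate the difference, here is a rough sketch of a class with one trivial setter and two setters that also touch shared state. The _assignedCount counter and the AssignedCount property are hypothetical additions, not part of the question. Because two setters may run concurrently on different threads, the shared counter is updated with Interlocked.Increment; a plain _assignedCount++ would be a data race.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class TestResult
{
    // Hypothetical shared state, touched by more than one setter.
    private int _assignedCount;
    private object _property2;
    private object _property3;

    // Trivial setter: writes only its own backing field.
    public object Property1 { get; set; }

    // Non-trivial setters: each one also updates the shared counter.
    public object Property2
    {
        get => _property2;
        set { _property2 = value; Interlocked.Increment(ref _assignedCount); }
    }

    public object Property3
    {
        get => _property3;
        set { _property3 = value; Interlocked.Increment(ref _assignedCount); }
    }

    public int AssignedCount => Volatile.Read(ref _assignedCount);
}

public static class SetterDemo
{
    public static TestResult Run()
    {
        var testResult = new TestResult();
        Parallel.Invoke(
            () => { testResult.Property1 = "value 1"; },
            () => { testResult.Property2 = "value 2"; },
            () => { testResult.Property3 = "value 3"; });
        return testResult;
    }

    public static void Main()
    {
        Console.WriteLine(Run().AssignedCount); // prints 2
    }
}
```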

As a side note, you are advised to configure the Parallel.Invoke operation with a reasonable MaxDegreeOfParallelism. Otherwise you'll get the default behavior of the Parallel class, which is to saturate the ThreadPool.

TestResult testResult = new();

Parallel.Invoke(new ParallelOptions()
{ MaxDegreeOfParallelism = Environment.ProcessorCount },
    () => testResult.Property1 = GetProperty1Value(),
    () => testResult.Property2 = GetProperty2Value(),
    () => testResult.Property3 = GetProperty3Value()
);

Alternative: In case you are wondering how you could initialize a TestResult instance without relying on closures and side-effects, here is one way to do it:

var taskFactory = new TaskFactory(new ConcurrentExclusiveSchedulerPair(
    TaskScheduler.Default, Environment.ProcessorCount).ConcurrentScheduler);

var task1 = taskFactory.StartNew(() => GetProperty1Value());
var task2 = taskFactory.StartNew(() => GetProperty2Value());
var task3 = taskFactory.StartNew(() => GetProperty3Value());

Task.WaitAll(task1, task2, task3);

TestResult testResult = new()
{
    Property1 = task1.Result,
    Property2 = task2.Result,
    Property3 = task3.Result,
};

The values of the properties are stored temporarily in the individual Task objects, and finally they are assigned to the properties, on the current thread, after the completion of all tasks. So this approach eliminates all thread-safety considerations regarding the integrity of the constructed TestResult instance.

But there is a disadvantage: The Parallel.Invoke utilizes the current thread, and invokes some of the actions on it too. In contrast, the Task.WaitAll approach will wastefully block the current thread, letting the ThreadPool do all the work.
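If the calling code can be asynchronous (an assumption, the question doesn't say), the blocked thread can be avoided by awaiting Task.WhenAll instead of calling Task.WaitAll. The sketch below uses Task.Run in place of the throttled TaskFactory for brevity, so it does not limit the degree of parallelism, and the GetPropertyNValue bodies are hypothetical stand-ins:

```csharp
using System;
using System.Threading.Tasks;

public class TestResult
{
    public object Property1;
    public object Property2;
    public object Property3;
}

public static class AsyncInitializationDemo
{
    // Hypothetical stand-ins for the expensive methods of the question.
    private static object GetProperty1Value() => "value 1";
    private static object GetProperty2Value() => "value 2";
    private static object GetProperty3Value() => "value 3";

    public static async Task<TestResult> CreateAsync()
    {
        Task<object> task1 = Task.Run(() => GetProperty1Value());
        Task<object> task2 = Task.Run(() => GetProperty2Value());
        Task<object> task3 = Task.Run(() => GetProperty3Value());

        // Releases the current thread while the ThreadPool does the work.
        await Task.WhenAll(task1, task2, task3);

        // All tasks have completed successfully, so Result does not block.
        return new TestResult
        {
            Property1 = task1.Result,
            Property2 = task2.Result,
            Property3 = task3.Result,
        };
    }

    public static async Task Main()
    {
        TestResult testResult = await CreateAsync();
        Console.WriteLine(testResult.Property1);
    }
}
```

As with the Task.WaitAll version, the fields are assigned on a single thread after all tasks have completed, so the integrity of the TestResult instance is not in question.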


Just for fun: I tried to write an ObjectInitializer tool that should be able to calculate the properties of an object in parallel, and then assign the value of each property sequentially (thread-safely), without having to manually manage a bunch of scattered Task variables. This is the API I came up with:

var initializer = new ObjectInitializer<TestResult>();
initializer.Add(() => GetProperty1Value(), (x, v) => x.Property1 = v);
initializer.Add(() => GetProperty2Value(), (x, v) => x.Property2 = v);
initializer.Add(() => GetProperty3Value(), (x, v) => x.Property3 = v);
TestResult testResult = initializer.RunParallel(degreeOfParallelism: 2);

Not very pretty, but at least it is concise. The Add method adds the metadata for one property, and the RunParallel does the parallel and sequential work. Here is the implementation:

public class ObjectInitializer<TObject> where TObject : new()
{
    private readonly List<Func<Action<TObject>>> _functions = new();

    public void Add<TProperty>(Func<TProperty> calculate,
        Action<TObject, TProperty> update)
    {
        _functions.Add(() =>
        {
            TProperty value = calculate();
            return source => update(source, value);
        });
    }

    public TObject RunParallel(int degreeOfParallelism)
    {
        TObject instance = new();
        _functions
            .AsParallel()
            .AsOrdered()
            .WithDegreeOfParallelism(degreeOfParallelism)
            .Select(func => func())
            .ToList()
            .ForEach(action => action(instance));
        return instance;
    }
}

It uses PLINQ instead of the Parallel class.

Would I use it? Probably not. Mostly because the need for initializing an object in parallel doesn't come up very often, and having to maintain such obscure code for such rare occasions seems like overkill. I would probably go with the dirty and side-effecty Parallel.Invoke approach instead. :-)

Theodor Zoulias
  • Incisive answer. Can you please clarify when Parallel.Invoke will fail? I thought ALL the actions would be executed regardless of whether an action had run into an exception. It would be great if you could point me to the documentation. Thanks! – SRIDHARAN Sep 11 '21 at 14:38
  • Hi @SRIDHARAN. Actually you are right. All `Action`s passed to the `Parallel.Invoke` are invoked no matter what. I was pretty confident that all methods of the `Parallel` class would have the same behavior, but apparently the `Parallel.Invoke` is a different beast. I am a bit disappointed to be honest, because I consider the fail-fast strategy a very desirable feature for a tool that facilitates parallelism. On the other hand this behavior can be enforced simply by not using the `Parallel.Invoke`, and using the `Parallel.ForEach` instead. So it's not a biggie really. – Theodor Zoulias Sep 11 '21 at 15:36
  • @SRIDHARAN I added one more approach. It's not very practical, but at least you might find it entertaining. :-) – Theodor Zoulias Sep 11 '21 at 17:45
  • Looks good. A bit of an overkill, but as you mentioned, "just for fun" justifies it :D Thanks! – SRIDHARAN Sep 12 '21 at 07:40