1

In Java I can write a Spliterator interface implementation, in which I can specify how exactly a source collection will be split into subcollections for parallel processing.

But I don't understand how to do this in C#. All I know is that you can call AsParallel on an IEnumerable, but can you control the splitting process?

For example, my source collection is an array and I want to split it into 5-item subarrays and have LINQ work with those subarrays. Is this possible?

Coder-Man

2 Answers

4

If you have specific partitioning requirements, you should have a look at Custom Partitioners for PLINQ and TPL and How to: Implement a Partitioner for Static Partitioning. These give an example of how you can implement a Partitioner<TSource>, which is presumably similar to Java's Spliterator.
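For the 5-item case from the question, a custom class may not even be needed. The sketch below assumes the built-in range partitioner, Partitioner.Create(fromInclusive, toExclusive, rangeSize), is good enough: it hands PLINQ index ranges of at most rangeSize elements. ChunkDemo and SumChunks are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

public static class ChunkDemo
{
    // Hypothetical helper: sums each fixed-size chunk of the array in parallel.
    // Assumes source is non-empty; Partitioner.Create rejects an empty range.
    public static int[] SumChunks(int[] source, int chunkSize)
    {
        // Each element produced by the partitioner is a Tuple<int, int>
        // describing the index range [Item1, Item2) of one chunk.
        return Partitioner
            .Create(0, source.Length, chunkSize)
            .AsParallel()
            .Select(range =>
            {
                int sum = 0;
                for (int i = range.Item1; i < range.Item2; i++)
                {
                    sum += source[i];
                }
                return sum;
            })
            .ToArray();
    }
}
```

Calling ChunkDemo.SumChunks(array, 5) returns one sum per chunk. The query is unordered, so the chunk results are not guaranteed to come back in source order.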

However, most of the time, it's more beneficial to let the framework pick its own dynamic partitioning strategy. If your requirement is to limit the level of concurrency to 5, you can use WithDegreeOfParallelism:

var summary = ints
    .AsParallel()
    .WithDegreeOfParallelism(5)
    .Aggregate(
        seed: (
            count: 0,
            sum: 0,
            min: int.MaxValue,
            max: int.MinValue),
        updateAccumulatorFunc: (acc, x) => (
            count: acc.count + 1,
            sum: acc.sum + x,
            min: Math.Min(acc.min, x),
            max: Math.Max(acc.max, x)),
        combineAccumulatorsFunc: (acc1, acc2) => (
            count: acc1.count + acc2.count,
            sum: acc1.sum + acc2.sum,
            min: Math.Min(acc1.min, acc2.min),
            max: Math.Max(acc1.max, acc2.max)),
        resultSelector: acc => (
            acc.count,
            acc.sum,
            acc.min,
            acc.max,
            avg: (double)acc.sum / acc.count));
Douglas
  • And each partition will be processed by a separate thread, right? – Coder-Man Aug 26 '18 at 14:53
  • Each partition _may_ be processed by a separate thread. The TPL and PLINQ run on the .NET thread pool, which keeps a bounded number of threads (typically equivalent to the number of logical cores in your machine). If your partitioner generates more partitions than there are threads in the thread pool, some of the partitions would be queued. This is the correct thing for it to do – if there are more threads than cores, you would be oversubscribing the processors, which reduces efficiency. – Douglas Aug 26 '18 at 14:58
  • And you can't specify more threads in a thread pool than there are cores? In Java you can create an executor instance and provide the number of threads when creating it. https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool(int) – Coder-Man Aug 26 '18 at 15:00
  • https://stackoverflow.com/questions/21163108/custom-thread-pool-in-java-8-parallel-stream see Lukas' answer. – Coder-Man Aug 26 '18 at 15:01
  • You can increase the thread count, including for the default thread pool using `ThreadPool.SetMinThreads`. However, it's typically not in your interest to do so for CPU-bound work, since you would only be incurring inefficiencies from the context-switching. – Douglas Aug 26 '18 at 15:02
0

Currently C# does not offer this feature the way Java does. Parallel tasks in C# have a MaxDegreeOfParallelism setting, supplied through a ParallelOptions instance, which lets you specify the maximum number of concurrent tasks; the framework then manages the number of tasks automatically.
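As a rough illustration of that setting, the sketch below passes a ParallelOptions instance to Parallel.ForEach. BoundedLoopDemo and SumWithLimit are hypothetical names, and the fixed range size of 5 is an assumption taken from the question, not something the API requires.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedLoopDemo
{
    // Hypothetical helper: sums the array using at most maxParallelism concurrent tasks.
    // Assumes source is non-empty; Partitioner.Create rejects an empty range.
    public static long SumWithLimit(int[] source, int maxParallelism)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
        long total = 0;

        // The range partitioner yields index ranges [Item1, Item2) of at most 5 elements.
        Parallel.ForEach(
            Partitioner.Create(0, source.Length, 5),
            options,
            range =>
            {
                long local = 0;
                for (int i = range.Item1; i < range.Item2; i++)
                {
                    local += source[i];
                }
                // Combine per-range results without a data race.
                Interlocked.Add(ref total, local);
            });

        return total;
    }
}
```

MaxDegreeOfParallelism is an upper bound: the scheduler may use fewer concurrent tasks than the value given.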

Mojtaba Tajik