I've been trying to get a deep understanding of how processors, cores, and threads relate to parallelism. Let me give a simple example and explain what I'm thinking so that you can correct it.
Let's say I want to try to sort two arrays
int[] A = { ... }; // very large, very unsorted
int[] B = { ... }; // very large, very unsorted
by sorting each of them "as parallel as my system will allow me to sort them." I take advantage of the fact that a Parallel.ForEach
does a lot of stuff under the hood, and I simply write
var arrays = new List<int[]> { A, B }; // collection initializer, not a constructor overload
Parallel.ForEach(arrays, arr => Array.Sort(arr));
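(For reference, here's the complete program I'm compiling. The array sizes and the random fill are placeholders I made up so it runs on its own, and the thread-ID printout is just my way of peeking at what Parallel.ForEach is doing; nothing about which thread sorts which array is guaranteed by the library.)

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class ParallelSortDemo
{
    static void Main()
    {
        var rng = new Random(42);
        int[] A = new int[5000000]; // stand-ins for "very large, very unsorted"
        int[] B = new int[5000000];
        for (int i = 0; i < A.Length; i++) { A[i] = rng.Next(); B[i] = rng.Next(); }

        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");

        var arrays = new List<int[]> { A, B };
        Parallel.ForEach(arrays, arr =>
        {
            // Shows which managed thread ends up sorting each array.
            Console.WriteLine($"Sorting on thread {Environment.CurrentManagedThreadId}");
            Array.Sort(arr);
        });
    }
}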
Now let's say I compile and run it on machines with the following specs:
- 1 processor, 1 core
- 1 processor, multiple cores
- 2 processors, at least one core on each
In case 1, there is absolutely no possibility of a performance gain. It sorts A, then sorts B, just as it would in a regular foreach loop.
In case 2, there is also no performance gain, because unless you have multiple processors your machine cannot literally "do more than one thing at once." Even if it ends up sorting them on different threads, the one CPU that runs those threads does a little sorting of A, a little sorting of B, a little more of A, and so on, which can't be more efficient than sorting all of A and then all of B.
Case 3 is the only one with a possibility of a performance gain, for the reason mentioned in the previous case.
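If it helps, here's the rough timing harness I plan to run on each machine to check myself: a Stopwatch around fresh copies of the arrays, sequential vs. parallel. The sizes and the MakeRandom helper are just things I invented for the test:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

class SortTiming
{
    static int[] MakeRandom(Random rng, int n)
    {
        var arr = new int[n];
        for (int i = 0; i < n; i++) arr[i] = rng.Next();
        return arr;
    }

    static void Main()
    {
        var rng = new Random(0);
        int[] A = MakeRandom(rng, 5000000);
        int[] B = MakeRandom(rng, 5000000);

        // Sequential baseline: sort fresh copies one after the other.
        int[] a1 = (int[])A.Clone(), b1 = (int[])B.Clone();
        var sw = Stopwatch.StartNew();
        Array.Sort(a1);
        Array.Sort(b1);
        sw.Stop();
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");

        // Parallel version on fresh copies of the same data.
        int[] a2 = (int[])A.Clone(), b2 = (int[])B.Clone();
        sw.Restart();
        Parallel.ForEach(new List<int[]> { a2, b2 }, arr => Array.Sort(arr));
        sw.Stop();
        Console.WriteLine($"Parallel:   {sw.ElapsedMilliseconds} ms");
    }
}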
Can someone critique my understanding? How right or wrong is this? (I didn't major in computer science. So please grade me on a curve.)