
I have a Python script that uses multiprocessing, and I would like to rewrite it in C#. I have been reading about C# multithreading and multiprocessing and I am thoroughly confused. Many articles suggest using the TPL or something like Parallel.ForEach, but many of them start by mentioning multiple cores and then quickly switch back to talking about threads. In Python I had to use the multiprocessing module specifically to achieve this (see: Multiprocessing vs Threading Python).

I wrote a small sample console app to test:

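    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class Program
    {
        static void Main(string[] args)
        {
            // Work items 1..20
            List<int> testList = new List<int>();
            testList.AddRange(Enumerable.Range(1, 20));

            // Run Test on each item in parallel and print the process and thread IDs
            Parallel.ForEach(testList, x => Console.WriteLine(Test(x)));

            Console.ReadLine();
        }

        private static string Test(int i)
        {
            // Simulate work: item i sleeps for (21 - i) seconds
            Thread.Sleep(1000 * (21 - i));
            return "P: " + Process.GetCurrentProcess().Id.ToString() + ", T: " + Thread.CurrentThread.ManagedThreadId.ToString();
        }
    }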

However, it shows one process ID and multiple thread IDs. Now I am not sure if my method is correct and if I am understanding everything correctly.

To give a bit of background, my application does a lot of computation on small datasets (about 300 data points each), but because of the huge number of parameter combinations, I need these computations to run as fast as possible.

Basically, what I want to know is whether Parallel.ForEach threads will automatically run on different cores, or whether there is something else I need to do.

Talib
  • The Microsoft .NET runtime maps .NET threads to Windows threads 1:1 (so each .NET thread is a Windows OS thread). Different Windows threads are normally scheduled by the Windows scheduler to different CPU cores. So you don't have to do anything. – xanatos Apr 24 '17 at 11:54
  • `Parallel.ForEach` will create `Task`s, which are a logical abstraction for pieces of work. These pieces are scheduled to run on the thread pool, which manages `Thread` objects that use system threads inside one system process to complete the job. The threads are scheduled by the OS and will probably run on different cores (not 100% guaranteed, because you can't directly manipulate scheduling, but most probably they will). So it works this way: ForEach -> Tasks -> Scheduling in thread pool -> .NET Threads -> OS Threads -> OS scheduling -> CPU cores – Sergey.quixoticaxis.Ivanov Apr 24 '17 at 11:56
  • Note that having multiple threads call `Console.WriteLine` buys you nothing. The console output is synchronized to make it possible to call it from multiple threads, but it won't be any faster. Whenever multiple threads call `Console.WriteLine`, only one thread at a time can do anything. The rest will be blocked, waiting. – Cody Gray - on strike Apr 24 '17 at 11:56
  • Thank you, Console is just for testing. I won't be using it for the actual program. The actual program will iterate through my data points and do calculations depending on a combination of parameters and then return a data object that gives my results which I plan to store in a list, upon which I plan to do some filtering. – Talib Apr 24 '17 at 12:05

1 Answer


The Microsoft .NET runtime maps .NET threads to Windows threads 1:1 (so each .NET thread is a Windows OS thread). Different Windows threads are normally scheduled by the Windows scheduler to different CPU cores. So you don't have to do anything.
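
A minimal sketch of how you could check this on your own machine, assuming a made-up CPU-bound `Evaluate` function standing in for one parameter combination (the class and method names here are hypothetical, not from the question). It caps the loop at `Environment.ProcessorCount` workers and records which managed thread handled each item:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class SweepSketch
{
    // Hypothetical stand-in for one CPU-bound evaluation of a parameter value.
    public static double Evaluate(int parameter)
    {
        double acc = 0;
        for (int k = 1; k <= 200000; k++)
            acc += Math.Sqrt(parameter * (double)k);
        return acc;
    }

    // Runs Evaluate for parameters 1..count in parallel and records which
    // managed thread handled each parameter.
    public static ConcurrentDictionary<int, int> Run(int count)
    {
        var threadByParameter = new ConcurrentDictionary<int, int>();
        var options = new ParallelOptions
        {
            // Cap concurrency at the number of logical cores.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(Enumerable.Range(1, count), options, p =>
        {
            Evaluate(p);
            threadByParameter[p] = Thread.CurrentThread.ManagedThreadId;
        });
        return threadByParameter;
    }

    static void Main()
    {
        var result = Run(64);
        Console.WriteLine("Logical cores: " + Environment.ProcessorCount);
        Console.WriteLine("Distinct threads used: " + result.Values.Distinct().Count());
    }
}
```

While this runs, a per-core CPU view (Task Manager, for instance) should show load on several cores even though there is only one process ID.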

As always, remember that threads are "expensive" objects. Unless each thread has a substantial amount of work to do, using them is pointless (don't use threads for units of work shorter than 1 second unless you have very specific needs).
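
If the individual units of work are small, one common way to give each worker more to do is to hand out contiguous ranges instead of single items. The following is only a sketch under assumptions: `SmallWork`, `ComputeAll`, and the range size are hypothetical, and the right chunk size has to be found by measuring. It uses `Partitioner.Create` from `System.Collections.Concurrent`:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class ChunkingSketch
{
    // Hypothetical cheap unit of work, too small to be worth scheduling on its own.
    public static double SmallWork(int i)
    {
        return Math.Sqrt(i);
    }

    // Computes SmallWork for every index in [0, n), giving each worker
    // a contiguous range of rangeSize indices at a time.
    public static double[] ComputeAll(int n, int rangeSize)
    {
        var results = new double[n];
        // Each range is a Tuple<int, int> covering [Item1, Item2).
        Parallel.ForEach(Partitioner.Create(0, n, rangeSize), range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
                results[i] = SmallWork(i); // each index is written by exactly one worker
        });
        return results;
    }

    static void Main()
    {
        double[] r = ComputeAll(1000000, 10000);
        Console.WriteLine("Computed " + r.Length + " values");
    }
}
```

Because every index belongs to exactly one range, the workers can write into the shared array without locking.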

xanatos
  • Are you sure about that first sentence? I could have sworn that the CLR did exactly the opposite, *not* tying managed threads directly to native/OS threads, which is why it can switch a managed thread from one native thread to another as it sees fit. Maybe this is something that has changed in more recent versions of the CLR? – Cody Gray - on strike Apr 24 '17 at 11:59
  • @CodyGray In the beginning Microsoft wanted to avoid tying .NET threads to OS threads, so that Windows Fibers could be used with the SQL .NET runtime (fibers are like coroutines), but then they discovered too many problems, so even the SQL .NET runtime doesn't use fibers. – xanatos Apr 24 '17 at 12:01
  • [This](https://msdn.microsoft.com/en-us/library/74169f59(v=vs.110).aspx) seems to agree with you @CodyGray – DavidG Apr 24 '17 at 12:01
  • @DavidG Yes, but note the example given: *Specifically, a sophisticated host can use the Fiber API to schedule many managed threads against the same operating system thread, or to move a managed thread among different operating system threads.*.. There is no such host. So it is only a theoretical exercise. For example, from the page of the SQL server: https://docs.microsoft.com/en-us/sql/database-engine/configure-windows/clr-enabled-server-configuration-option: *Common language runtime (CLR) execution is not supported under lightweight pooling* (the lightweight pooling is the use of fibers) – xanatos Apr 24 '17 at 12:04
  • Mmmh found the article: http://joeduffyblog.com/2006/11/09/fibers-and-the-clr/: *The CLR tried to add support for fibers in Whidbey. This was done in response to SQL Server Yukon hosting the runtime in process, aka SQLCLR. Eventually, mostly due to schedule pressure and a long stress bug tail related to fiber-mode, we threw up our hands and declared it unsupported* – xanatos Apr 24 '17 at 12:08
  • To clarify, I am using parameter sweeping as my method of analysis to identify the best combination of parameters. In Python, each action takes about 0.3 - 0.5 seconds. I am hoping that this will be faster in C#, but what you are saying is that threads won't be the answer for me? – Talib Apr 24 '17 at 12:09
  • @Talib You have to benchmark it... You are in a zone where you can gain some improvement in "wall time" (what you can measure with a wristwatch) at the expense of much more "CPU time" (so your CPU will probably have to work 50% more to give you 30% less wall time) – xanatos Apr 24 '17 at 12:11
  • @Talib Then it depends if the computer that will run the work is "dedicated" (so you can and should squeeze it out for its CPU time) or it is shared (like a Web server, where squeezing the CPU isn't a good idea and each "process" should be "nice" to the other processes and not try to "steal" CPU time) – xanatos Apr 24 '17 at 12:15
  • @xanatos. Thank you. It is definitely dedicated, or the aim is later to have a dedicated server or cluster. At first, I just need to prove my methods are working which is difficult to do if the wall time is long. So at the moment I just need to shave off minutes. Everything will most likely have to be rewritten as soon as I move to a clustered environment. – Talib Apr 24 '17 at 12:20
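
The benchmarking suggested in the comments above could be sketched roughly as follows. This is an illustration only: `Work` is a hypothetical CPU-bound placeholder and `Measure` is a helper invented for this example. It compares wall time (via `Stopwatch`) with the CPU time consumed by the process (via `Process.TotalProcessorTime`) for a sequential loop and a `Parallel.ForEach` loop:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class TimingSketch
{
    // Hypothetical CPU-bound unit of work.
    public static double Work(int seed)
    {
        double acc = 0;
        for (int k = 1; k <= 2000000; k++)
            acc += Math.Sqrt(seed + k);
        return acc;
    }

    // Runs the action once and returns (wall time, CPU time used by this process).
    public static Tuple<TimeSpan, TimeSpan> Measure(Action action)
    {
        Process self = Process.GetCurrentProcess();
        TimeSpan cpuBefore = self.TotalProcessorTime;
        Stopwatch wall = Stopwatch.StartNew();
        action();
        wall.Stop();
        self.Refresh(); // drop any cached process info before reading CPU time again
        return Tuple.Create(wall.Elapsed, self.TotalProcessorTime - cpuBefore);
    }

    static void Main()
    {
        int[] items = Enumerable.Range(1, 64).ToArray();
        var sequential = Measure(() => { foreach (int i in items) Work(i); });
        var parallel = Measure(() => Parallel.ForEach(items, i => Work(i)));
        Console.WriteLine("Sequential: wall " + sequential.Item1 + ", CPU " + sequential.Item2);
        Console.WriteLine("Parallel:   wall " + parallel.Item1 + ", CPU " + parallel.Item2);
    }
}
```

On a multi-core machine the parallel run would be expected to show lower wall time and similar or higher CPU time, but the actual numbers depend on the hardware and on how heavy the real work is, so they need to be measured with the real computation.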