It looks like your multiprocessor architecture is NUMA, which means CPUs have different access times to different memory regions. Systems with such an architecture balance the load so that work is scheduled only on those processors that are "closer" to (have the lowest access time for) the memory area being operated on. For an ordinary .NET application, the standard memory layout places everything in the same memory area, which on NUMA architectures leads to using only the CPUs closest to that area (in your case it could be 2 NUMA nodes with 1 CPU each, with only 1 in use because it serves the memory area your application uses).
Applications that want to benefit from the NUMA architecture are supposed to use specific APIs which expose, among other calls, calls to indicate in which NUMA node to allocate memory (here is an example of the API functions provided by Windows). The .NET CLR, starting from version 4.5, is able to use this API indirectly through specific configuration settings. On the CLR runtimes you need to set the following options in the application settings:
<configuration>
  <runtime>
    <gcServer enabled="true"/>
    <Thread_UseAllCpuGroups enabled="true"/>
  </runtime>
</configuration>
Here, gcServer mode controls NUMA awareness for memory allocations, so that the runtime can maintain multiple heaps for different NUMA nodes, and Thread_UseAllCpuGroups controls NUMA awareness for the thread pool.
For .NET Core runtimes, however, you need to turn on gcServer mode through the runtime options, while Thread_UseAllCpuGroups, which is part of the ThreadPool settings, can be passed via an environment variable with the prefix COMPlus_, according to the linked reference (i.e. set COMPlus_Thread_UseAllCpuGroups=1).
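As a rough sketch of the runtime option mentioned above, server GC for a .NET Core application can typically be enabled in the application's runtimeconfig.json (or in a runtimeconfig.template.json next to the project file). The exact file name depends on your application's name, so treat the layout below as an assumption to adapt rather than a drop-in file:

```json
{
  "runtimeOptions": {
    "configProperties": {
      "System.GC.Server": true
    }
  }
}
```

The same setting can also be expressed in an SDK-style project file with the `<ServerGarbageCollector>true</ServerGarbageCollector>` MSBuild property, which should produce the equivalent entry in the generated runtimeconfig.json at build time.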