I have an application with 4 threads, which I'm running on an 8-core Intel Process with CentOS. When I run two instances of it, the two instances combined run faster than 1 instance. Moreover, for two instances, they run faster when I do not bind each thread to a different core and left scheduling to the scheduler.
Going with that logic, perhaps, one instance is also running slow because scheduler had pinned all the 4 threads to 4 separate cores. When I pin each thread of 2 instances, that is, 8 threads pinned to 8 cores, the two instances combined also take approximately the same time as the one instance.
The application is lock intensive, as there are frequent shared memory acesses. Also note that the CPU overhead remains 0% otherwise, so there are almost no other processes consuming cycles.
Now my question is what is going on here? One explanation that comes to my mind is that when threads are pinned to different cores, they try to acquire the mutexes almost at the same time, and as a result, high contention occurs, but when they are scheduled by the scheduler, they might try to acquire mutexes at slightly different times, thus decreasing contention.
Any opinions! This is really weird, because left to the scheduler, one instance runs slower than two instances combined. How do I compile my benchmarking results. No on would believe me!
Processor Info as given by /proc/cpuinfo is the following.
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 0
cpu cores : 4
apicid : 16
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.15
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5319.96
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 1
cpu cores : 4
apicid : 18
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.04
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.05
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 4
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 2
cpu cores : 4
apicid : 20
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.04
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.05
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 6
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 3
cpu cores : 4
apicid : 22
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.03
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.07
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 8
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 0
cpu cores : 4
apicid : 17
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.01
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 9
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.07
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 10
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 1
cpu cores : 4
apicid : 19
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.04
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 11
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.00
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 12
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 2
cpu cores : 4
apicid : 21
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.03
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 13
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.05
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 14
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 1
siblings : 8
core id : 3
cpu cores : 4
apicid : 23
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.02
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
processor : 15
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping : 5
cpu MHz : 2660.076
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 5320.04
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]