I have observed that when Linux futexes are contended, the system spends A LOT of time in spinlocks. This is a problem not only when futexes are used directly, but also when calling malloc/free, rand, GLib mutex functions, and other system/library calls that use futexes internally. Is there ANY way of getting rid of this behavior?
I am using CentOS 6.3 with kernel 2.6.32-279.9.1.el6.x86_64. I also tried the latest stable kernel 3.6.6 downloaded directly from kernel.org.
Originally, the problem occurred on a 24-core server with 16 GB of RAM, running a process with 700 threads. The data collected with "perf record" showed that the spinlock is called from the futex code, which in turn is called from __lll_lock_wait_private and __lll_unlock_wake_private, and that it is eating up 50% of the CPU time. When I stopped the process with gdb, the backtraces showed that the calls to __lll_lock_wait_private and __lll_unlock_wake_private are made from malloc and free.
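For illustration only, the allocation pattern that produces these backtraces looks roughly like the following. This is my own minimal sketch, not the real application's code: the thread count mirrors the 700 threads mentioned above, the allocation size is arbitrary, and names like alloc_worker and NUM_THREADS are placeholders. glibc's malloc serializes access to its arenas with futex-based locks, so with many more threads than arenas a loop like this ends up in __lll_lock_wait_private / __lll_unlock_wake_private.
#include <pthread.h>
#include <stdlib.h>
#define NUM_THREADS 700   /* mirrors the thread count of the real process */
/* Each thread allocates and frees in a tight loop. The futex-based locks
   inside glibc's allocator are what show up in the backtraces above. */
static void *alloc_worker (void *arg)
{
  for (;;)
    {
      void *p = malloc (64);   /* the size is not important */
      free (p);
    }
  return NULL;
}
int main (void)
{
  pthread_t threads[NUM_THREADS];
  int t;
  for (t = 0; t < NUM_THREADS; t++)
    pthread_create (&threads[t], NULL, alloc_worker, NULL);
  for (t = 0; t < NUM_THREADS; t++)
    pthread_join (threads[t], NULL);   /* never returns; the workers loop forever */
  return 0;
}
(Build with gcc -pthread. How much contention this actually shows depends on how many arenas glibc creates for the given thread count.)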
To narrow the problem down, I wrote a simple program that shows it is indeed the futexes that are causing the spinlock contention.
It starts 8 threads, with each thread doing the following:
//...
static GMutex *lMethodMutex = g_mutex_new ();
while (true)
{
static guint64 i = 0;
g_mutex_lock (lMethodMutex);
// Perform any operation in the user space that needs to be protected.
// The operation itself is not important. It's the taking and releasing
// of the mutex that matters.
++i;
g_mutex_unlock (lMethodMutex);
}
//...
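For reference, a complete version of the test program looks roughly like this. It is only a sketch of the setup around the fragment above, assuming the old GLib threading API available on CentOS 6 (g_thread_init, g_thread_create, g_mutex_new; newer GLib replaces these with g_thread_new and g_mutex_init), and names like worker and NUM_THREADS are placeholders:
#include <glib.h>
#define NUM_THREADS 8
static GMutex *lMethodMutex;   /* shared by all threads */
static gpointer worker (gpointer data)
{
  static guint64 i = 0;
  while (TRUE)
    {
      g_mutex_lock (lMethodMutex);
      /* The operation itself is not important; only taking and
         releasing the mutex matters. */
      ++i;
      g_mutex_unlock (lMethodMutex);
    }
  return NULL;
}
int main (void)
{
  GThread *threads[NUM_THREADS];
  int t;
  g_thread_init (NULL);            /* required by the old GLib on CentOS 6 */
  lMethodMutex = g_mutex_new ();   /* deprecated in later GLib, matches the fragment above */
  for (t = 0; t < NUM_THREADS; t++)
    threads[t] = g_thread_create (worker, NULL, TRUE, NULL);
  for (t = 0; t < NUM_THREADS; t++)
    g_thread_join (threads[t]);    /* never returns; the workers loop forever */
  return 0;
}
(Built with something like gcc repro.c $(pkg-config --cflags --libs glib-2.0 gthread-2.0).)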
I am running this on an 8-core machine, with plenty of RAM.
Using "top", I observed that the machine is 10% idle, 10% in the user mode, and 90% in the system mode.
Using "perf top", I observed the following:
50.73% [kernel] [k] _spin_lock
11.13% [kernel] [k] hpet_msi_next_event
2.98% libpthread-2.12.so [.] pthread_mutex_lock
2.90% libpthread-2.12.so [.] pthread_mutex_unlock
1.94% libpthread-2.12.so [.] __lll_lock_wait
1.59% [kernel] [k] futex_wake
1.43% [kernel] [k] __audit_syscall_exit
1.38% [kernel] [k] copy_user_generic_string
1.35% [kernel] [k] system_call
1.07% [kernel] [k] schedule
0.99% [kernel] [k] hash_futex
I would expect this code to spend some time in the spinlock, since the futex code has to acquire the spinlock protecting the futex wait queue. I would also expect the code to spend some time in system mode, since in this snippet there is very little code running in user space. However, 50% of the time spent in the spinlock seems excessive, especially when that CPU time is needed to do other useful work.