10

How can I confirm that a host is NUMA-aware? The Oracle doc says that NUMA-awareness starts at kernel 2.6.19, but the NUMA man page says that it was introduced with 2.6.14. I'd like to be sure that a Java process started with -XX:+UseNUMA is actually taking advantage of something.

Checking for the numa_maps, I see that I have them:

# find /proc -name numa_maps
/proc/1/task/1/numa_maps
/proc/1/numa_maps
/proc/2/task/2/numa_maps
/proc/2/numa_maps
/proc/3/task/3/numa_maps

Though my kernel is behind what Oracle states:

# uname -sr
Linux 2.6.18-92.el5

I'm currently using 64-bit jdk1.6.0_29 on RHEL5.1.

PhilR
Christopher Neylan
  • Did you consider upgrading your kernel to something more recent? – Basile Starynkevitch Apr 11 '12 at 16:17
  • RHEL 5.1 (2007-11-07) is pretty old, perhaps it's time to upgrade. – Peter Lawrey Apr 11 '12 at 16:23
  • @Peter In my experience the people working with NUMA processors usually aren't the ones in charge of updating the software and have to go through quite a lot of bureaucracy to get it updated. Just saw a *python 2.4* install last week on a supercomputer with 2k cores.. – Voo Apr 11 '12 at 16:34
  • @aix I assume that could be the case on multi processor machines on a single MB (not sure there), but then not many people have those either and on a single processor Sandy bridge all CPUs should have the same latency to the whole memory? I'm pretty sure Intel describes SB processors as SMPs – Voo Apr 11 '12 at 16:42
  • @Voo: Fair point. I've withdrawn my remark. – NPE Apr 11 '12 at 16:47

2 Answers

10

The presence of those /proc files indicates that your Linux kernel is NUMA-aware. Don't worry too much about comparing version numbers: vendors, particularly with Oracle and RHEL kernels, backport many features without keeping the version string "up to date".
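
To go one step further and see whether a given process (for example the JVM started with -XX:+UseNUMA) actually has pages on more than one node, the per-node fields in /proc/<pid>/numa_maps can be summed. This is a minimal sketch, assuming the N<node>=<pages> field format described in numa(7); add_node_pages is a hypothetical helper name, and I have not checked the exact fields that a 2.6.18-92.el5 kernel emits.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Adds the page counts found in one numa_maps line to pages[0..max_nodes-1].
 * Returns the number of N<node>=<pages> fields seen on the line.
 * (Hypothetical helper; assumes the field format described in numa(7).)
 */
int add_node_pages(const char *line, unsigned long *pages, int max_nodes) {
    int fields = 0;
    const char *p = line;

    while (*p != '\0') {
        /* Skip whitespace between fields. */
        while (*p == ' ' || *p == '\t' || *p == '\n') {
            p++;
        }
        if (*p == '\0') {
            break;
        }
        /* A per-node field is 'N', digits, '=', digits. */
        if (*p == 'N' && isdigit((unsigned char)p[1])) {
            char *end;
            unsigned long node = strtoul(p + 1, &end, 10);
            if (*end == '=' && isdigit((unsigned char)end[1])) {
                unsigned long count = strtoul(end + 1, &end, 10);
                if (node < (unsigned long)max_nodes) {
                    pages[node] += count;
                }
                fields++;
            }
        }
        /* Advance to the end of the current field. */
        while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n') {
            p++;
        }
    }
    return fields;
}
```

Call it once per line read with fgets() from /proc/<pid>/numa_maps. If every total except one stays at zero, the process is effectively running on a single node.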

Other ways of testing the same thing:

$ grep NUMA=y /boot/config-`uname -r`
CONFIG_NUMA=y
CONFIG_K8_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
CONFIG_ACPI_NUMA=y

$ numactl --hardware
available: 2 nodes (0-1)
node 0 size: 18156 MB
node 0 free: 9053 MB
node 1 size: 18180 MB
node 1 free: 6853 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10
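
If numactl is not installed, the node count can be cross-checked without libnuma. This is a rough sketch, assuming the kernel exposes one node<N> directory per NUMA node under /sys/devices/system/node (the usual sysfs layout; I have not confirmed it on this particular RHEL kernel). is_node_entry and count_numa_nodes are hypothetical helper names.

```c
#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Returns 1 if name looks like "node<digits>" (e.g. "node0"), else 0. */
int is_node_entry(const char *name) {
    if (strncmp(name, "node", 4) != 0 || name[4] == '\0') {
        return 0;
    }
    for (const char *p = name + 4; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p)) {
            return 0;
        }
    }
    return 1;
}

/* Counts node<N> entries in dir; returns -1 if dir cannot be opened. */
int count_numa_nodes(const char *dir) {
    DIR *d = opendir(dir);
    if (d == NULL) {
        return -1;
    }

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (is_node_entry(entry->d_name)) {
            count++;
        }
    }
    closedir(d);
    return count;
}
```

Calling count_numa_nodes("/sys/devices/system/node") from main() should agree with the "available: N nodes" line above. A result of 1 means a single node, so there is little for -XX:+UseNUMA to exploit, and -1 means the directory could not be opened.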
PhilR
1

The Oracle doc also states:

Note: There was a known bug in the Linux Kernel that may cause the JVM to crash when being run with -XX:UseNUMA. The bug was fixed in 2012, so this should not affect the latest versions of the Linux Kernel. To see if your Kernel has this bug, you can run the native reproducer.

Which I have reproduced here to demonstrate its simplicity:

http://docs.oracle.com/javase/7/docs/technotes/guides/vm/reproducer.c

To build the reproducer, you may need to install the numactl or numactl-devel packages depending on your distribution. See man numa_maps for details.

#include <numaif.h>
#include <numa.h>
#include <stddef.h>
#include <sys/mman.h>
#include <stdint.h>
#include <unistd.h> /* getpagesize() */

int main(void) {
   if (numa_all_nodes_ptr == (void*)0) {
     return -1;
   }

   size_t pagesize = getpagesize();

   void* mapped_memory = mmap(NULL, 3 * pagesize, PROT_READ|PROT_WRITE,
                              MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
   if (mapped_memory == MAP_FAILED) {
     return -2;
   }

   void* page0 = mapped_memory;
   void* page1 = (void*)((uintptr_t)page0 + pagesize);
   void* page2 = (void*)((uintptr_t)page1 + pagesize); 

   // Set up the last page as interleaved.
   mbind(page2, pagesize, MPOL_INTERLEAVE, numa_all_nodes_ptr->maskp,
         numa_all_nodes_ptr->size, 0);

   // Set up the last two pages as interleaved.
   mbind(page1, 2 * pagesize, MPOL_INTERLEAVE,
         numa_all_nodes_ptr->maskp, numa_all_nodes_ptr->size, 0);

   *((char*)page2) = 2;
   *((char*)page1) = 1;
   *((char*)page0) = 0; // Crash here, when mbind_merge was broken.

   return 0;
}

So, I took the ambiguity to mean that 2.6.19 was the first safe version.

Donald_W