I am struggling to profile what looks like internal malloc memory fragmentation in a database server application. To rule out a leak, all malloc, realloc and free calls are wrapped with our own accounting, which prepends its own header to each allocation to track the memory balance, and the code is also run under Valgrind against quite a big test suite. Moreover, most of the time we use our own custom allocator, which mmaps pools of memory directly and does its own administration.
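The glibc side of that wrapper is roughly the following (a minimal sketch with made-up names; the real code also wraps realloc and handles alignment):

    #include <stdlib.h>
    #include <stdatomic.h>

    /* Simplified sketch of the accounting wrapper. */
    static atomic_size_t g_balance;          /* bytes currently outstanding */

    typedef struct {
        size_t size;                         /* requested size, for bookkeeping */
    } acct_header_t;

    void *acct_malloc(size_t size)
    {
        acct_header_t *h = malloc(sizeof(acct_header_t) + size);
        if (!h)
            return NULL;
        h->size = size;
        atomic_fetch_add(&g_balance, size);
        return h + 1;                        /* hand out the memory after our header */
    }

    void acct_free(void *p)
    {
        if (!p)
            return;
        acct_header_t *h = (acct_header_t *)p - 1;
        atomic_fetch_sub(&g_balance, h->size);
        free(h);
    }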
glibc's malloc is used only for some small stuff that doesn't fit the scheme of our own allocator.
Running a test for a few days that just keeps allocating and freeing a lot of memory in the server (lots of short connections coming and going, lots of DDL operations modifying global catalogs) results in the "RES" memory creeping up and staying up, way above our internal accounting. Over these few days of testing we count a total of about 400 TB of memory being malloced and freed, while the balance reported by our accounting stays around a few hundred megabytes to 2-3 GB most of the time (with spikes up to 15 GB). The "RES" memory of the process, however, never drops below 8.3-8.4 GB. Parsing /proc/$PID/maps, practically all of it sits in "rw-p" mappings of exactly 64 MB (or "rw-p" plus a "---p" reserved "tail") - in a captured snapshot, 143 such arenas account almost exactly for those 8.3-8.4 GB of resident memory.
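For reference, this is roughly how I tally those mappings from /proc/$PID/maps (simplified sketch, error handling trimmed): each arena shows up as a "rw-p" mapping, optionally followed by a contiguous "---p" tail, and the two together are exactly 64 MB.

    #include <stdio.h>
    #include <string.h>

    #define ARENA_SIZE (64UL << 20)

    int main(int argc, char **argv)
    {
        char path[64], line[512], perms[8];
        unsigned long start, end;
        unsigned long rw_start = 0, rw_end = 0;   /* pending partially committed arena */
        unsigned long arenas = 0, committed = 0;

        snprintf(path, sizeof path, "/proc/%s/maps", argc > 1 ? argv[1] : "self");
        FILE *f = fopen(path, "r");
        if (!f) { perror(path); return 1; }

        while (fgets(line, sizeof line, f)) {
            if (sscanf(line, "%lx-%lx %7s", &start, &end, perms) != 3)
                continue;

            if (strcmp(perms, "rw-p") == 0) {
                if (end - start == ARENA_SIZE) {          /* fully committed arena */
                    arenas++;
                    committed += ARENA_SIZE;
                } else {                                  /* possible arena head, tail may follow */
                    rw_start = start;
                    rw_end = end;
                }
            } else if (strcmp(perms, "---p") == 0 && start == rw_end
                       && end - rw_start == ARENA_SIZE) { /* head + reserved tail = 64 MB */
                arenas++;
                committed += rw_end - rw_start;           /* only the rw-p part counts towards RES */
                rw_start = rw_end = 0;
            }
        }
        fclose(f);

        printf("%lu arena-sized mappings, ~%.1f GB committed (rw-p)\n",
               arenas, committed / (double)(1UL << 30));
        return 0;
    }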
Googling around suggests that glibc's malloc allocates memory in such 64 MB arenas, and that multiple arenas can cause excessive "VIRT" memory:
- https://infobright.com/blog/malloc_arena_max/
- https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en
However, in my case most of the arenas are full and actually count towards RES, not VIRT (only 9 out of the 143 arenas have a "---p" tail of more than 1 MB).
In this case it is just a few GB of memory, but on actual production systems we've seen the discrepancy grow to numbers like 40-50 GB (on a 512 GB RAM server).
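For completeness, the knobs those articles point at can also be tried from inside the process (a sketch; M_ARENA_MAX and malloc_trim are glibc-specific). Neither addresses fragmentation inside an already-full arena, which seems to be my situation:

    #include <malloc.h>   /* glibc-specific: mallopt, M_ARENA_MAX, malloc_trim */

    /* Equivalent of MALLOC_ARENA_MAX=2, set early at startup. This mainly
     * limits VIRT growth from many arenas; it does not defragment a full one. */
    int limit_arenas(void)
    {
        return mallopt(M_ARENA_MAX, 2);   /* returns 1 on success */
    }

    /* Ask glibc to return free memory to the kernel (heap top plus, on newer
     * glibc, whole free pages via MADV_DONTNEED). Only helps when the free
     * space isn't interleaved with live chunks - i.e. not with real
     * fragmentation. */
    void try_release(void)
    {
        malloc_trim(0);
    }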
Is there a way I could get more insight into this fragmentation? The malloc_info output seems to be somewhat corrupted, reporting odd numbers like:
<unsorted from="321" to="847883883078550" total="140585643867701" count="847883883078876"/>
- this exact line (the exact same "to", "total" and "count") repeats in every heap.
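For reference, this is roughly how I capture that dump (a sketch; the function name is just illustrative - in the real server it hangs off an admin command):

    #include <malloc.h>   /* glibc: malloc_info, malloc_stats */
    #include <stdio.h>

    /* Dump glibc's per-arena state: malloc_info() writes an XML report
     * (free chunks per bin, per-heap totals), malloc_stats() prints a short
     * human-readable summary to stderr. */
    void dump_malloc_state(const char *path)
    {
        FILE *f = fopen(path, "w");
        if (!f)
            return;
        malloc_info(0, f);
        fclose(f);

        malloc_stats();
    }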
I'm going to test the behaviour of different allocators (jemalloc, tcmalloc) in a similar fashion.