Memory allocator suggestion

Question

I have multi-threaded section where threads need to allocate several large segments of data, say ~100MB each, to use as buffers. Moreover, buffers may need to be resized at run time Several times.

The natural solution is to use realloc but it may move memory which is not needed. free/malloc pair to resize buffer i am afraid may lead to fragmentation and reserving memory before hand creates other problems.

What can I use instead, to allocate/reallocate memory?

"reserving memory before hand creates other problems" So a pooling allocator is not suitable? And are you wanting to stay standard-ish, or are you looking for approaches for a custom allocation scheme? — Corbin, May 21 '12 at 23:09
@Corbin I don't think pools are good in my case - Linux has first-touch policy in NUMA domanis. I'd like to stay with *portable* solution — Anycorn, May 21 '12 at 23:11
any way of managing the data in the buffers such that you can avoid the resize need? or even put an upper bound on the buffer size? how much memory do you plan to have on the hardware? — NG., May 21 '12 at 23:18
@Anycorn Ah, afraid I'm not going to be much help then. I don't really see much of a reasonable way around a pool while maintaining efficient resizing (and thus the copying resizing sometimes requires). My memory management skills are bad at best though :). — Corbin, May 21 '12 at 23:22

Dietrich Epp · Accepted Answer · 2012-05-21T23:53:58.433

Use free and malloc. This will NOT cause fragmentation problems.

Modern allocators are fairly resistant to memory fragmentation. These days, it takes a fairly pathological program to cause fragmentation problems. Fragmentation was a more severe problem when our programs addressed physical RAM directly, but with virtual memory a large "hole" in a program's heap doesn't need to consume any resources.

Furthermore, due to the size of the buffers, most allocators will request a dedicated region from the kernel for each buffer. On Linux / OS X / BSD, this means an anonymous mmap behind the scenes for each buffer. This can cause fragmentation of address space, but virtual address space is basically free on a 64-bit system, and a few hundred megs isn't a problem on 32-bit either.

So use free and malloc.

Alternative: You might find it faster to make each buffer bigger than you need. The way malloc works on a modern Unix, any pages which you don't write to don't consume memory.

So if you malloc a 500 MB buffer but only use the first 100 MB, your program doesn't actually use more memory than if you malloc a 100 MB buffer and use the whole thing. You get more address space fragmentation this way, but that's not a problem on 64-bit systems, and you can always tune the allocation size so it works on 32-bit systems too.

As for suggestions to use mmap, just think of malloc/free as a simpler interface to mmap/munmap, which is what it is for large allocations (1 MiB is a common threshold).

Is there a reference that describes under what condition fragmentation happens? — Anycorn, May 21 '12 at 23:53
There are some weird things you can do, like allocate a bunch of 16-byte chunks, free the odd-numbered ones, then allocate a bunch of 24-byte chunks, free the odd-numbered ones, and keep on doing things like that. It's basically a worst-case scenario for `malloc`, but there's a bound on the percentage of total memory wasted. — Dietrich Epp, May 21 '12 at 23:58
@Anycorn: Actually, since we're talking about such large buffers, it's really *address space* fragmentation and NOT memory fragmentation. If you're on 64-bit, address space fragmentation is not something you care about. — Dietrich Epp, May 22 '12 at 00:04

score 4 · Answer 2 · answered May 22 '12 at 02:16

Simply use realloc. On a modern system, even if the buffer gets moved to a new address, the move will happen by manipulating the page tables (on Linux, mremap; I'm sure other systems have a similar mechanism) and not by copying data. (Note here that I'm assuming large buffers; for small buffers, usually less than a few hundred kb, actual copying will happen.)

If your target is 64-bit machines, there's absolutely no need to worry about memory fragmentation. You'll never fragment memory badly enough to run out of virtual address space. If you need to handle 32-bit machines too, you're probably safe as long as you don't have too many threads. As long as the total memory consumption is less than 1GB, it would be very hard to run out of virtual address space due to fragmentation, assuming your usage pattern. If you are worried about it, just preallocate the largest size you might possibly need.

score 1 · Answer 3 · edited May 23 '17 at 12:04

1

Implement your solution with malloc/realloc/free and profile it. If memory allocation is a problem, you can use a better implementaion of malloc such as facebook's jemalloc, or google's tcmalloc.

See C++ memory allocation mechanism performance comparison (tcmalloc vs. jemalloc) for a comparison of the two.

They are both pretty good at handling internal/external fragmentations.

edited May 23 '17 at 12:04

Community

1
1

answered May 21 '12 at 23:41

Shayan Pooya

1,049
1
13
22

1

For 100 MB buffers, you won't notice a difference between jemalloc or tcmalloc because they both request pages directly from the kernel for allocations over a certain threshold (1 MiB is common). The default `malloc` on your system will do the same thing. – Dietrich Epp May 21 '12 at 23:49
@DietrichEpp the problem with the default malloc is lock-contention in case of multi-threaded applications. Although, I would certainly profile my program before using them. – Shayan Pooya May 21 '12 at 23:51
1

If you're talking about 100 MB buffers, the cost of a even a highly contested lock is going to be minimal, because you have to make a syscall anyway. – Dietrich Epp May 21 '12 at 23:59
Indeed. Fancy allocators will not help you AT ALL for large buffers like this. All the time (including lock contention, for the global vm lock) will be spent in kernelspace. – R.. GitHub STOP HELPING ICE May 22 '12 at 02:17

Memory allocator suggestion

3 Answers3