
I've been reading up a little on zero-pause garbage collectors for managed languages. From what I understand, one of the most difficult things to do without stop-the-world pauses is heap compaction. Only a very few collectors (e.g. Azul C4, ZGC) seem to be doing this, or at least approaching it.

So, most GCs introduce dreaded stop-the-world pauses to compact the heap (bad!). Not doing this seems extremely difficult, and does come with a performance/throughput penalty. So either way, this step seems rather problematic.

And yet - as far as I know, most if not all GCs still do compact the heap occasionally. I've yet to see a modern GC that doesn't do this by default. Which leads me to believe: It has to be really, really important. If it wasn't, surely, the tradeoff wouldn't be worth it.

At the same time, I have never seen anyone do memory defragmentation in C++. I'm sure some people somewhere do, but - correct me if I am wrong - it does not at all seem to be a common concern. I could of course imagine static memory somewhat lessens this, but surely, most codebases would do a fair amount of dynamic allocations?!

So I'm curious, why is that?

Are my assumptions (very important in managed languages; rarely done in C++) even correct? If yes, is there any explanation I'm missing?

Bogey
    You cannot do heap compaction in C++ because that changes the addresses of objects that people have stored pointers to. – Botje Oct 14 '20 at 08:05
  • Thanks, yes I understand that would be rather tricky - but it still doesn't really solve the mystery for me. That implies it's not really done in practice, which would mean: It can't be too important either, as clearly C++ applications are running just fine ;-) But why would managed languages go through the trouble in that case? – Bogey Oct 14 '20 at 08:08
  • It is very important. Long-running C++ processes can and do suffer from memory fragmentation. Modern memory allocators (like tcmalloc or jemalloc) try to prevent these problems by using several allocation buckets for requests of the same size and managing a free list to make sure holes in these buckets are filled first before asking for extra memory from the OS. Managed languages have an extra incentive to do so because the fewer memory pages your GC has to scan in the mark phase, the faster it runs. – Botje Oct 14 '20 at 08:11
  • You should probably just compare memory management of C++ to C++/CLI and nothing else. – Sinatr Oct 14 '20 at 08:11
  • C++ uses the operating system's memory allocation system (abstracted through `new`/`delete` and `malloc`/`free`). And if the operating system uses virtual memory you really can't compact the "heap". The reason is that all you have is virtual addresses, and compacting virtual memory might not move data as you expect in the physical memory. – Some programmer dude Oct 14 '20 at 08:13
  • If you're interested in this kind of stuff, read the [jemalloc paper](https://www.bsdcan.org/2006/papers/jemalloc.pdf) or the [mimalloc paper](https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf) as an intro to modern advanced memory management. – Botje Oct 14 '20 at 08:16
  • @Botje the standard does not require that the addresses refer to the physical memory addresses. In theory, those could be virtual addresses, because doing out of bounds memory access is UB. But I'm not aware of an implementation that uses virtual addresses. – t.niese Oct 14 '20 at 08:21
  • I think you meant to tag @Someprogrammerdude ? I never mentioned physical/virtual memory. – Botje Oct 14 '20 at 08:22
  • Another technique used in C/C++ programs: memory arenas for short-lived, bursty allocations that can be cleaned up in one go. That also prevents memory fragmentation. – Botje Oct 14 '20 at 08:23
  • Thanks all! Very interesting points. @Botje in particular regarding importance of compaction for GC performance - that's a great point. I had only been thinking of it in terms of potential allocation issues due to fragmentation so far. – Bogey Oct 14 '20 at 08:30
  • Lots of good explanations and preventative measures in [this question](https://stackoverflow.com/questions/3770457/what-is-memory-fragmentation), though it's not a dup. – Botje Oct 14 '20 at 08:31
  • Related: https://softwareengineering.stackexchange.com/questions/208182/why-are-reference-counting-smart-pointers-so-popular – Jeremy Friesner Oct 14 '20 at 21:29

2 Answers

2

Garbage collection can compact the heap because it knows where all of the pointers are. After all, it just finished tracing them. That means that it can move objects around and adjust the pointers (references) to the new location.

However, C++ cannot do that, because it doesn't know where all the pointers are. If the memory allocation library moved things around, there could be dangling pointers to the old locations.
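
To make that concrete, here is a deliberately toy sketch (not how any production collector is written), assuming fixed-size objects, a single root table as the only place references are stored, and no pointers held inside the objects themselves; a real collector would also have to patch object fields, thread stacks and registers:

```cpp
#include <cstddef>
#include <new>
#include <unordered_map>
#include <vector>

// Toy heap: fixed-size objects bump-allocated from one buffer. The "collector"
// knows every reference, because they all live in the roots table below, and
// that knowledge is what makes moving objects and patching pointers legal.
struct Node {
    int  value;
    bool marked;   // set by a trace phase (omitted here) if the node is reachable
};

struct ToyHeap {
    std::vector<std::byte> space;
    std::size_t            top = 0;
    std::vector<Node*>     roots;   // the only place references are stored

    explicit ToyHeap(std::size_t bytes) : space(bytes) {}

    Node* allocate(int v) {                                  // capacity checks omitted
        Node* n = new (space.data() + top) Node{v, false};   // bump allocation
        top += sizeof(Node);
        return n;
    }

    // Evacuation-style compaction: copy live objects to the front of a fresh
    // buffer, remember old -> new addresses, then rewrite every root.
    void compact() {
        std::vector<std::byte> to(space.size());
        std::unordered_map<Node*, Node*> forwarding;
        std::size_t newTop = 0;

        for (std::size_t off = 0; off < top; off += sizeof(Node)) {
            Node* old = reinterpret_cast<Node*>(space.data() + off);
            if (!old->marked) continue;                        // dead: left behind
            Node* moved = new (to.data() + newTop) Node(*old); // relocate
            moved->marked = false;                             // reset for next cycle
            forwarding[old] = moved;
            newTop += sizeof(Node);
        }
        // Patch the references. (Assumes the trace marked everything the roots
        // point at; a real GC also patches pointers stored inside objects.)
        for (Node*& r : roots)
            if (r) r = forwarding.at(r);

        space.swap(to);
        top = newTop;
    }
};
```

The whole trick hinges on the `roots` table: because every reference is known, rewriting it after the move is safe. A `malloc`/`free` implementation has no equivalent knowledge of where your program stashed its pointers, so it cannot do this behind your back.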

Oh, and for long-running processes, C++ can indeed suffer from memory fragmentation. This was more of a problem on 32-bit systems: a request for memory from the OS could fail because all of the available 1 MB memory blocks had been used up. On 64-bit systems it is almost impossible to create so many memory mappings that there is nowhere to put a new one. However, if you end up with a single 16-byte allocation alive in each 4K memory page, that's still a lot of wasted space.

C and C++ applications solve that by using storage pools. A web server, for example, would start a pool for each new request; at the end of that request, everything in the pool gets destroyed. The pool is a nice, constant-sized block of RAM that gets reused over and over without fragmentation.
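
Here is a rough sketch of that per-request pool using the standard `std::pmr` facilities from C++17; the `Request`/`add_header` names are purely illustrative, not from any real server:

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <string_view>
#include <vector>

// Everything a request allocates comes out of one monotonic buffer and is
// released in a single shot when the request is destroyed, so nothing is left
// behind to fragment the long-lived heap.
struct Request {
    std::byte buffer[64 * 1024];   // request-local slab; spills to the heap only if exceeded
    std::pmr::monotonic_buffer_resource arena{buffer, sizeof(buffer)};

    // Request-local containers draw from the arena instead of the global heap.
    std::pmr::vector<std::pmr::string> headers{&arena};

    void add_header(std::string_view h) { headers.emplace_back(h); }
};  // destroying the Request releases the whole arena at once

int main() {
    for (int i = 0; i < 1000; ++i) {
        Request req;                                // one pool per "request"
        req.add_header("Content-Type: text/html");
        req.add_header("Connection: keep-alive");
    }                                               // this iteration's allocations vanish here
}
```

Because the arena is torn down wholesale, there is no per-object `free` pattern that could leave holes behind.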

Garbage-collected runtimes tend to use recycling pools as well, because doing so avoids the cost of running a big GC trace and reclaim at the end of every connection.

One method some old operating systems, like Apple's Mac OS 9, used before virtual memory was a thing is handles. Instead of a memory pointer, allocation returned a handle, which referred to the real object's current location in memory. When the operating system needed to compact memory or swap it out to disk, it moved the object and updated what the handle pointed to.

I have actually implemented a similar system in C++ using an array of handles into a shared-memory-map pseudo-database. When the map was compacted, the handle table was scanned for affected entries and updated.
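
A much-simplified sketch of that handle approach (illustrative only, not the actual shared-memory code) could look like this:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// User code never stores a raw address, only an index into the handle table.
// compact() may move the payloads and rewrite the table; every handle stays
// valid because it is re-resolved on each access.
class HandleHeap {
    struct Slot { std::size_t offset; std::size_t size; bool live; };

    std::vector<std::byte> storage;   // the compactable region
    std::vector<Slot>      slots;     // one entry per handle ever issued
    std::size_t            top = 0;

public:
    explicit HandleHeap(std::size_t bytes) : storage(bytes) {}

    // Hand out a handle (slot index), never a raw pointer. Bounds checks omitted.
    std::size_t allocate(std::size_t size) {
        slots.push_back({top, size, true});
        top += size;
        return slots.size() - 1;
    }

    void release(std::size_t handle) { slots[handle].live = false; }

    // Resolve a handle to its current address. Callers must not cache the
    // result across a call to compact().
    std::byte* deref(std::size_t handle) {
        return storage.data() + slots[handle].offset;
    }

    // Slide live blocks toward the start and fix up the affected table entries.
    void compact() {
        std::size_t newTop = 0;
        for (Slot& s : slots) {
            if (!s.live) continue;
            std::memmove(storage.data() + newTop, storage.data() + s.offset, s.size);
            s.offset = newTop;
            newTop  += s.size;
        }
        top = newTop;
    }
};
```

The extra indirection is the price: every access costs one more lookup, which is exactly the trade-off the old handle-based systems made.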

Zan Lynx
  • I hear some GC treat the heap and stack as arrays of `void*` pointers and scan them for pointers that look like valid object pointers to heap or stack (whose ranges are known to VM) and are properly aligned. They can also look for poison/tag the top unused bits of the pointer on 64-bit architectures. Such a scan requires stopping all threads first, though. – Maxim Egorushkin Oct 14 '20 at 21:22
  • @MaximEgorushkin It can be done that way. Other systems can do a trace from "roots" such as the stack or from the memory allocator. It doesn't have to scan all memory because the only possible valid pointers have to be pointed to by one of the roots. Yet another way to do it is to store arrays of pointers as they are created, and then the GC also knows where every pointer is. – Zan Lynx Oct 14 '20 at 22:30
  • Okay, makes sense. – Maxim Egorushkin Oct 15 '20 at 00:28
1

Generic memory compaction is not generally useful or desirable, because of its costs.

What may be desirable is to have no wasted/fragmented memory, and that can be achieved by methods other than memory compaction.

In C++ one can come up with a different allocation approach for the objects that cause fragmentation in a specific application, e.g. double-pointers or double-indexes to allow for object relocation, or object pools and arenas that prevent or minimize fragmentation. Such solutions for specific object types are superior to generic garbage collection because they employ application/business-specific knowledge, which allows the scope and cost of object storage maintenance to be minimized and lets it happen at the most appropriate times.
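
As a rough illustration of the object-pool variant (the class and member names below are hypothetical, not taken from any particular library), a fixed-size pool with an intrusive free list might look like this:

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Every object of type T lives in one contiguous slab; freed slots are reused
// before any new memory is touched, so this type never churns or fragments the
// general-purpose heap.
template <typename T>
class ObjectPool {
    // While free, a slot stores a link to the next free slot; while in use, it
    // stores the T itself (the object sits at the very start of its slot).
    union Slot {
        Slot* next;
        alignas(T) unsigned char storage[sizeof(T)];
    };

    std::vector<Slot> slab;        // sized once up front, never reallocated
    Slot*             freeList = nullptr;

public:
    explicit ObjectPool(std::size_t capacity) : slab(capacity) {
        for (Slot& s : slab) { s.next = freeList; freeList = &s; }
    }

    template <typename... Args>
    T* create(Args&&... args) {
        if (!freeList) return nullptr;            // exhausted; a real pool might grow
        Slot* s = freeList;
        freeList = s->next;
        return new (s->storage) T(std::forward<Args>(args)...);
    }

    void destroy(T* p) {
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);     // recover the slot from the object
        s->next = freeList;
        freeList = s;
    }
};
```

Because all slots have the same size, a freed slot always fits the next allocation exactly, which is the application-specific knowledge a generic allocator or GC does not have.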

One study found that garbage-collected languages require 5 times more memory to achieve the performance of equivalent non-GC programs. Memory fragmentation is also more severe in GC languages.

Maxim Egorushkin
  • Some caveats on that paper: it used measurements from a 1 GHz PowerPC machine with tiny cache sizes (compared to modern chips), and it does not study incremental garbage collectors. – Botje Oct 14 '20 at 11:07
  • @Botje I'd like to see a more up-to-date research. – Maxim Egorushkin Oct 14 '20 at 11:17
  • Me too. I browsed the "cited by" pages in Google Scholar for a bit and the author's dblp entry, but could not find an immediate follow-up. – Botje Oct 14 '20 at 11:32