6

I've been having this discussion with a colleague. When languages such as C# or Java garbage-collect objects such as strings, returning their memory to the heap, do they also clean out the memory block, for example by overwriting it with 0s or 1s?

My assumption is that the block is returned as is, unless one uses a class such as SecureString, whose finalizer zeroes out the block.

rvh
  • It would be *logical*, to me at least, to zero it out, for security reasons. – Eugene Apr 15 '14 at 14:53
  • Related: http://stackoverflow.com/questions/6441218/can-a-local-variables-memory-be-accessed-outside-its-scope/6445794#6445794 – Servy Apr 15 '14 at 15:22
  • 1
    Usually language implementations favor performance. Hence, they only zero out memory when they have to, and given the choice of doing it when garbage collecting *or* when allocating, they choose allocation time, simply because they follow the "do as little work as possible" principle. – Durandal Apr 15 '14 at 17:18

2 Answers

3

Practically speaking, no, this doesn't happen. Overwriting memory you've just freed takes time, so there are performance penalties. "Secure" objects like SecureString are just wiping themselves, not relying on the GC.
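
To make the "wiping themselves" point concrete, here is a minimal Java sketch of an object that clears its own buffer instead of counting on the collector. The class name `WipingSecret` and its `view()` accessor are made up for illustration; this is not how SecureString is implemented, just the same idea under the assumption that the secret lives in a `char[]` the object owns.

```java
import java.util.Arrays;

// Hypothetical holder for a secret; not a real library class.
final class WipingSecret implements AutoCloseable {
    private final char[] chars;

    WipingSecret(char[] source) {
        // Keep a private copy so the caller can wipe its own array separately.
        this.chars = Arrays.copyOf(source, source.length);
    }

    // Exposes the internal buffer; done here only so the demo can inspect it.
    char[] view() {
        return chars;
    }

    @Override
    public void close() {
        // The object overwrites its own memory; the GC is not asked to do it.
        Arrays.fill(chars, '\0');
    }

    public static void main(String[] args) {
        char[] buffer;
        try (WipingSecret secret = new WipingSecret(new char[] {'p', 'w'})) {
            buffer = secret.view();
            System.out.println("before close: " + (int) buffer[0]);
        }
        // close() has run, so the buffer now holds only zero chars.
        System.out.println("after close: " + (int) buffer[0]);
    }
}
```

The wipe happens in `close()`, at a point the program controls, which is the whole reason for not leaving it to whenever (or whether) the collector gets around to the object.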

More broadly, it depends very much on that particular implementation of that particular language. Every language that assumes the existence of a GC (like C#) specifies different rules about how and when garbage collection should happen.

To take your C# example, the C# specification does not require that objects be overwritten after being freed, and it doesn't forbid it either:

Finally, at some time after the object becomes eligible for collection, the garbage collector frees the memory associated with that object.

§3.9 C# 5.0 Language Specification

If the memory is later used for a new instance of a reference type, its fields are set to their default values before your constructor runs its own custom initialization. If the memory is later used for a value type, it likewise gets zeroed out before you can start reading from it:

Initialization to default values is typically done by having the memory manager or garbage collector initialize memory to all-bits-zero before it is allocated for use. For this reason, it is convenient to use all-bits-zero to represent the null reference.

§5.2 C# 5.0 Language Specification
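
Java gives the same kind of guarantee: fields and array elements start out with default values (zero, `false`, or `null`), whatever the underlying memory held before. Here is a small sketch that just observes this; the class and method names are made up, and it says nothing about *when* a given JVM does the zeroing, only that you cannot see stale data.

```java
final class DefaultInitDemo {
    int counter;   // no initializer: starts at 0
    Object ref;    // no initializer: starts at null

    // Allocates a new array; every element is guaranteed to be 0.
    static int[] freshInts(int n) {
        return new int[n];
    }

    // Returns true if every element of the array is zero.
    static boolean allZero(int[] values) {
        for (int v : values) {
            if (v != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DefaultInitDemo demo = new DefaultInitDemo();
        System.out.println(demo.counter == 0 && demo.ref == null);
        System.out.println(allZero(freshInts(1_000_000)));
    }
}
```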

Additionally, there are at least two implementations of C# (Microsoft's and Mono's), so just saying "C#" isn't specific enough. Each implementation might decide to overwrite memory (or not).

John Feminella
  • Probably a stupid question, but what if you freed two bytes, A and B, and B has some ones (some info)? The GC reclaims the memory, and these two bytes are given to someone else, who only writes to byte A. How will that be handled? Doesn't that lead to stale data? – Eugene Apr 15 '14 at 14:59
  • @Eugene: Assuming that were possible in a hypothetical language, that would generally be considered a programming error. You should always initialize data before you read from it. – John Feminella Apr 15 '14 at 15:07
  • @Eugene Outside of `unsafe` code, in C# you *can't* read memory that hasn't been allocated for you. It goes out of its way to make sure there *is no way* to read the memory after it is freed. If the memory is given over to some code not written in C#, and it decides to read memory allocated for it without first clearing/setting it, it can. There is no defined behavior for what it will see in such cases. – Servy Apr 15 '14 at 15:21
  • @Servy: Because `out` parameters are simply `ref` parameters decorated with an attribute that isn't universally honored, a virtual method with an `out` parameter may be overridden by code, written in another language, which returns without storing anything there. If a struct constructor, written in C#, passes the struct under construction as an `out` parameter to such a method, it's possible that `foo = new MyStruct(1,2);` might leave some or all fields of `foo` unmodified. – supercat Apr 15 '14 at 16:25
  • Given that zeroing out memory wholesale is cheaper than zeroing it out piecemeal, and a lot of newly-allocated memory will need to be zero, I would think it would be cheaper to have the GC bulk-zero freed memory, and know that it had done so, saving the need to zero out bits and pieces of it later. – supercat Apr 15 '14 at 16:28
  • @supercat This is speculation on my part, but I would think that when you do a GC cycle, you really need it to be quick because you're probably running out of memory, so it's better to defer memory initialization closer to the point where you actually instantiate objects. – John Feminella Apr 15 '14 at 19:11
  • @supercat: That's surely true, but zeroing the memory means bringing it in cache (and probably evicting more important data). Maybe doing it in blocks of a few KiB might be best. OTOH it makes sense to do it using otherwise idle cores. – maaartinus Apr 15 '14 at 19:26
  • @maaartinus: From my understanding, many processors include mechanisms to explicitly read and write a cache line of data at a time; this is part of why I would expect bulk-zero operations to be faster than piecemeal, though a lot could depend upon the memory controller. – supercat Apr 15 '14 at 19:31
2

To the extent of my knowledge, there's not a single garbage collector that actually wipes memory with 0s or any other value. The C# and Java garbage collectors reclaim memory from unused objects and mark it as available. SecureString wipes itself at finalization, but that is not a GC thing.

yorodm