17

So if I understand correctly, garbage collection automatically deallocates objects that the program no longer uses, like the garbage collector in Java.

I hear that in languages like C, which don't support garbage collection, programs can have memory leaks and eventually exhaust the available memory.

So what errors do programmers make in languages like C that don't support garbage collection? I would guess not deallocating objects once they are no longer used. But is that the only kind of error we can make because of the lack of a garbage collector?

Rex M
user69514
  • 1
    You can still have leaks in a garbage collected environment in the form of "over-rooted objects". That is, objects that are still strongly referenced that you no longer need. – bbum Sep 15 '09 at 16:49

11 Answers

19
  • Deallocating things you still need

  • Not deallocating things you no longer need (because you're not tracking allocations/use/frees well)

  • Re-allocating new instances of things that already exist (a side-effect of not tracking properly)

  • De-allocating something you've already freed

  • De-allocating something that doesn't exist (a null pointer)

There are probably more. The point is: managing memory is tricky, and is best dealt with using some sort of tracking mechanism and allocating/freeing abstraction. As such, you might as well have that built into your language, so it can make it nice and easy for you. Manual memory management isn't the end of the world -- it's certainly doable -- but these days, unless you're writing real-time code, hardware drivers, or (maybe, possibly) the ultra-optimised core code of the latest game, then manual effort isn't worth it, except as an academic exercise.
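As a rough illustration of the kind of "freeing abstraction" mentioned above, here is a minimal sketch in C. The helper names `free_and_null` and `dup_string` are made up for this example. The idea is that resetting a pointer to NULL after freeing it turns an accidental second free into a harmless `free(NULL)`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: free *pp and reset it to NULL so that a second
 * call becomes a harmless free(NULL) instead of a double free. */
void free_and_null(char **pp)
{
    assert(pp != NULL);
    free(*pp);      /* free(NULL) is defined to do nothing */
    *pp = NULL;
}

/* Hypothetical helper: return a heap copy of s; the caller owns the copy
 * and is responsible for freeing it exactly once. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}
```

Note that this only protects the one pointer variable you pass in. Any other copy of the same pointer still dangles after the free, which is exactly the tracking problem described in the list.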

Lee B
  • 1
    Oh, one more issue: manually managing array length (resizing the array as you load more items into it, etc.) is tedious at best. Increasing by one item each time is inefficient, so you tend to have to start tracking slots used, actual slots allocated, slots needed, etc. And you really don't want to have to start de-allocating stuff from the middle of that array and then shrinking it. Compared to the effortless arrays and dicts of a modern language like Python, it's really low-level stuff. – Lee B Sep 15 '09 at 02:05
  • 9
    In reference to your last point. De-allocating NULL is completely fine. `free(NULL)` is guaranteed to be completely safe by the standard. – Evan Teran Sep 15 '09 at 03:18
  • 1
    Low-level languages like C are for a LOT more than just ultra-optimised code or drivers. Some things are just more efficiently done in C, and there's a reason it's still the most popular language (at least for OSS projects). – Kasper Oct 03 '09 at 10:03
  • 7
    @kmm: Of course, but then all of those things are more efficiently done in hand-crafted assembly that bypasses the OS, by experts who know the processor, RAM, chipset, hard drive geometry, etc. inside-out. The question is not whether it's more efficient, but whether it's reasonable to spend the extra time tracking all that, for relatively small gains in efficiency, given the option of more rapid development in a higher-level tool. – Lee B Oct 06 '09 at 10:39
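To make the array-resizing comment above concrete, here is a minimal sketch of the bookkeeping it describes, in C. The `IntVec` type and its functions are invented for this example. It tracks slots used and slots allocated, and doubles the allocation when full rather than growing by one item each time:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of ints. */
typedef struct {
    int    *items;
    size_t  used;       /* slots currently holding values */
    size_t  capacity;   /* slots actually allocated */
} IntVec;

/* Append value, doubling the capacity when the array is full.
 * Returns 0 on success, -1 if memory could not be obtained. */
int intvec_push(IntVec *v, int value)
{
    assert(v != NULL);
    if (v->used == v->capacity) {
        size_t new_cap = v->capacity ? v->capacity * 2 : 4;
        int *grown = realloc(v->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;  /* the old block is still valid and still owned by v */
        v->items = grown;
        v->capacity = new_cap;
    }
    v->items[v->used++] = value;
    return 0;
}

/* Release the storage and reset the vector to its empty state. */
void intvec_free(IntVec *v)
{
    assert(v != NULL);
    free(v->items);
    v->items = NULL;
    v->used = v->capacity = 0;
}
```

Start from a zero-initialised `IntVec v = {0};`, call `intvec_push(&v, x)` for each item, and call `intvec_free(&v)` when done. The sketch ignores size overflow and never shrinks, which is part of why this gets tedious by hand.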
13

IMO, garbage collected languages have complementary problems to those in non-garbage-collected languages. For every issue, there is a non-GC-characteristic bug and a GC-characteristic bug - a non-GC programmer responsibility and a GC programmer responsibility.

GC programmers may believe that they are relieved of responsibility for freeing objects, but objects hold resources other than memory - resources that often need to be released in a timely way so that they can be acquired elsewhere - e.g. file handles, record locks, mutexes...

Where a non-GC programmer would have a dangling reference (and very often one that isn't a bug, since some flag or other state would mark it as not to be used), a GC programmer has a memory leak. Thus where the non-GC programmer is responsible for ensuring that free/delete is called appropriately, a GC programmer is responsible for ensuring that unwanted references are nulled or otherwise disposed of appropriately.

There is a claim in here that smart pointers don't deal with garbage cycles. This need not be true - there are reference counting schemes that can break cycles and which also ensure timely disposal of garbage memory, and at least one Java implementation used (and may still do) a reference counting scheme that could just as easily be implemented as a smart pointer scheme in C++.

Concurrent Cycle Collection in Reference Counted Systems

Of course this isn't normally done - partly because you may as well just use a GC language, but also partly IMO because it would break key conventions in C++. You see, lots of C++ code - including the standard library - relies heavily on the Resource Acquisition Is Initialisation (RAII) convention, and that relies on reliable and timely destructor calls. In any GC that copes with cycles, you simply cannot have that. When breaking a garbage cycle, you cannot know which destructor to call first without any dependency issues - it may not even be possible, since there may be more cyclic dependencies than just memory references. The solution - in Java etc, there is no guarantee that finalizers will be called. Garbage collection only collects one very specific kind of garbage - memory. All other resources must be cleaned up manually, as they would have been in Pascal or C, and without the advantage of reliable C++-style destructors.

End result - a lot of cleanup that gets "automated" in C++ has to be done manually in Java, C# etc. Of course "automated" needs the quotes because the programmer is responsible for ensuring that delete is called appropriately for any heap-allocated objects - but then in GC languages, there are different but complementary programmer responsibilities. Either way, if the programmer fails to handle those responsibilities correctly, you get bugs.

[EDIT - there are cases where Java, C# etc obviously do reliable (if not necessarily timely) cleanup, and files are an example of this. These are objects where reference cycles cannot happen - either because (1) they don't contain references at all, (2) there's some static proof that the references they contain cannot directly or indirectly lead back to another object of the same type, or (3) the run-time logic ensures that, while chains/trees/whatever may be possible, cycles are not. Cases (1) and (2) are extremely common for resource-managing objects as opposed to data-structure nodes - perhaps universal. The compiler itself cannot reasonably guarantee (3), though. So while standard library developers, who write the most important resource classes, can ensure reliable cleanup for those, the general rule is still that reliable cleanup of non-memory resources cannot be guaranteed for a GC, and this could affect application-defined resources.]

Frankly, switching from non-GC to GC (or vice versa) is no magic wand. It may make the usual suspect problems go away, but that just means you need new skillsets to prevent (and debug) a whole new set of suspects.

A good programmer should get past the whose-side-are-you-on BS and learn to handle both.

  • I wish framework designers would facilitate deterministic lifetime management for entities with clear owners while using GC for owner-less objects. Note that even entities with owners should be wrapped in GC-collected objects to ensure that as long as a reference to a dead object exists, it will continue to be a reference to that same dead object. GC has huge advantages for ensuring program safety and correctness, but in general the correct use of mutable objects, *even those without resources*, requires that one have a clear concept of ownership. Something like a `List` may not need... – supercat Sep 13 '13 at 17:58
  • ...to be "disposed", but if code holds a reference to a `List` where it's expecting some other object to put something, and that other object has abandoned that list and started using another, such a condition might be more readily detected if the object which had owned the list had invalidated it before abandoning it. – supercat Sep 13 '13 at 18:00
8

Well, the errors you can make are:

  • Not deallocating things you don't need
  • Deallocating things you do need

There are other errors you can make, but those are the ones that relate specifically to garbage collection.

Noon Silk
3

In addition to what silky says, you can also deallocate something twice.

Alex Gaynor
2

In C, you have to manually call free on memory allocated with malloc. While this doesn't sound so bad, it can get very messy when dealing with separate data structures (like linked lists) that point to the same data. You could end up accessing freed memory or double-freeing memory, both of which cause errors and can introduce security vulnerabilities.

Additionally, in C++, you need to be careful of mixing new[]/delete and new/delete[].

For example, memory management is something that requires the programmer to know exactly why

const char *getstr() { return "Hello, world!"; }

is just fine but

const char *getstr() {
    char x[BUF_SIZE];
    fgets(x, BUF_SIZE, stdin);
    return x;
}

is a very bad thing.
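The difference is storage duration: the string literal lives for the whole run of the program, while the array `x` stops existing when the function returns, so the second version hands back a dangling pointer. One common way around it, sketched below, is to let the caller supply the buffer. The name `read_line` and its extra `FILE *` parameter are my own choices for the example, not anything from the code above:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical replacement for the second getstr(): the caller owns the
 * buffer, so the returned pointer stays valid as long as the caller's
 * buffer does. Returns buf on success, NULL on end of file or read error. */
char *read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && size > 0 && in != NULL);
    if (fgets(buf, (int)size, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
    return buf;
}
```

A caller would write something like `char line[BUF_SIZE]; read_line(line, sizeof line, stdin);`. The alternative is to `malloc` the buffer inside the function, but then the caller has to remember to `free` it, which brings back the ownership problem this thread is about.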

Sriram Sakthivel
Andrew Keeton
  • 2
    Keep in mind any mature C++ programmer shudders at the use of raw memory. Wrap it up, do away with it. – GManNickG Sep 15 '09 at 02:58
2

In addition to other comments, manual memory management makes certain high performance concurrent algorithms more difficult.

Tom Hawtin - tackline
  • 1
    Garbage collection makes certain other high performance concurrent algorithms more difficult, though the performance issues tend to get blamed on the garbage collector which is having to search for garbage that would have been trivially handled using memory management. Religious wars aside, it really is swings and roundabouts. –  Sep 30 '09 at 10:28
  • oops - that's "... trivially handled using *manual* memory management" of course. –  Sep 30 '09 at 10:29
2

Some non-GC languages offer constructs called reference-counting smart pointers. These try to get around some problems, such as forgetting to deallocate memory or trying to access invalid memory, by automating some of the management functions.

As some have said, you have to be "smart" about "smart pointers". Smart pointers help to avoid a whole class of problems, but introduce their own class of problems.

Many smart pointers can create memory leaks by:

  • cycles or circular reference (A points to B, B points to A).
  • bugs in the smart pointer implementation (rare on mature libraries like Boost)
  • mixing raw pointers with smart pointers
  • thread safety
  • improperly attaching to or detaching from a raw pointer

These problems shouldn't be encountered in fully GC'ed environments.
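The cycle case from the list above can be sketched without any smart pointer library at all. Below is a hand-rolled reference count in plain C; the `Node` type, its functions, and the `live_nodes` counter are all invented for this illustration. Two nodes that hold counted references to each other never reach a count of zero, even after every outside reference has been released:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted node. */
typedef struct Node {
    int          refs;
    struct Node *other;   /* strong (counted) reference to another node */
} Node;

int live_nodes = 0;       /* lets us observe how many nodes are not yet freed */

Node *node_new(void)
{
    Node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->refs = 1;      /* the creator holds the first reference */
        live_nodes++;
    }
    return n;
}

/* Drop one reference; free the node when the count reaches zero. */
void node_release(Node *n)
{
    if (n == NULL || --n->refs > 0)
        return;
    node_release(n->other);   /* drop the reference this node was holding */
    free(n);
    live_nodes--;
}

/* Make `from` hold a counted reference to `to`. */
void node_link(Node *from, Node *to)
{
    assert(from != NULL && to != NULL && from->other == NULL);
    to->refs++;
    from->other = to;
}
```

With a simple chain (`node_link(a, b)`, then release `b` and `a`), both nodes are freed. With a cycle (`node_link(c, d); node_link(d, c);`), releasing `c` and `d` leaves both counts at 1 and both nodes allocated. The usual fix is to make one direction an uncounted ("weak") reference.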

James Schek
0

Another common error is reading or writing memory after you've deallocated it (memory which has since been reallocated and is now being used for something else, or memory which hasn't been reallocated yet and which is therefore still owned by the heap manager and not by your application).

ChrisW
0

Usually, languages with Garbage Collection restrict the programmer's access to memory, and rely on a memory model where objects contain:

  • reference counters - the GC uses this to know when an object is unused, and
  • type and size information - to eliminate buffer overruns (and help reduce other bugs).

In comparison with a non-GC language, there are two classes of errors that are reduced/eliminated by the model and the restricted access:

  1. Memory Model errors, such as:

    • memory leaks (failure to deallocate when done),
    • freeing memory more than once,
    • freeing memory that was not allocated (like global or stack variables),
  2. Pointer errors, such as:

    • Uninitialized pointer, with "left over" bits from previous use,
    • Accessing, especially writing to, memory after freeing (nasty!)
    • Buffer overrun errors,
    • Use of memory as wrong type (by casting).

There are more, but those are the big ones.
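For a feel of what "size information in the object" buys you, here is a rough C sketch of a buffer that carries its own length and refuses out-of-range accesses. The `Buffer` type and its functions are made up for the example. A managed runtime does this kind of check for you on every array access:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "fat" buffer: the length travels with the data. */
typedef struct {
    size_t        len;
    unsigned char data[];   /* C99 flexible array member */
} Buffer;

/* Allocate a zero-filled buffer of len bytes; returns NULL on failure. */
Buffer *buffer_new(size_t len)
{
    Buffer *b = calloc(1, sizeof *b + len);
    if (b != NULL)
        b->len = len;
    return b;
}

/* Checked write: rejects out-of-range indices instead of overrunning. */
bool buffer_set(Buffer *b, size_t i, unsigned char value)
{
    if (b == NULL || i >= b->len)
        return false;
    b->data[i] = value;
    return true;
}

/* Checked read: stores the byte in *out and returns true if i is in range. */
bool buffer_get(const Buffer *b, size_t i, unsigned char *out)
{
    assert(out != NULL);
    if (b == NULL || i >= b->len)
        return false;
    *out = b->data[i];
    return true;
}
```

Nothing stops C code from bypassing these functions and indexing `data` directly, which is the "restricted access" half of the argument: the checks only eliminate overruns if the language forces every access through them.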

NVRAM
0

Memory leaks in C++ are not inherent to the language; they are due to programming errors. Valgrind is a wonderful tool for finding those errors, which makes them easy to fix. Smart pointers are a built-in tool of the language that helps automatically delete things when they go out of scope, but circular references can prevent the last reference from a parent to a member, and from that member back to its parent, from ever being deleted. The point is that if you avoid using smart pointers for circular references (use raw pointers or "&" references for those) and you run your C++ code through Valgrind, you won't have memory leaks, and your code will probably be about 2x faster than Java without your having done anything special.

This is not to advocate C++ over Java; Java has other features, like reflection, that are useful. Besides language features, the deciding factor between Java and C++ is generally whether developer time or computing resources are more "expensive" under any metric you care to use. Java is the industrial-strength application/architecture language. C++ is the performant compute language that can be called from just about any other language out there. The project I'm working on is largely a Java architecture with some C++ components (legacy code) and stuff that needs to be performant (I'm the C++ guy on the team writing the math-intensive code that needs to be performant).

Keith
-3

Please don't compare OO languages (Java, C#) with non OO languages (C) when talking about garbage collection. OO languages (mostly) allow you to implement GC (See comment about smart pointers). Yes they are not easy but they help a lot, and they are deterministic.

Also, how do GC languages compare to non-GC languages when considering resources other than memory, e.g. files, network connections, DB connections, etc.?

I think answering that question, left to the reader, will shed some light on things too.

Richard
  • 2
    There's GC for C as well (check out Boehm GC). OO vs. non-OO is not particularly important. Pretty much all functional languages have GC, whether or not it's a functional-OO hybrid. Also, smart pointers are kind of poor man's GC. – Chuck Sep 15 '09 at 03:30
  • Specifically, smart pointer schemes don't deal with garbage cycles. – Stephen C Sep 15 '09 at 05:41
  • Did forget about functional languages. Thanks for the tip. – Richard Sep 16 '09 at 00:57
  • 1
    SP != GC. GC generally involves an active process of finding and reclaiming unused memory. SP is a passive mechanism that extends RAII semantics to the heap. They are both forms of automatic memory management, but garbage collection is distinctly different from smart pointers. The regular "pointers" (i.e. references) in Java are not "smart" in any way like C++ smart pointers. – James Schek Sep 28 '09 at 16:37