147

I understand Rust doesn't have a garbage collector and am wondering how memory is freed up when a binding goes out of scope.

So in this example, I understand that Rust reclaims the memory allocated to a when it goes out of scope.

{
    let a = 4
}

The problem I am having with this, is firstly how this happens, and secondly isn't this a sort of garbage collection? How does it differ from typical garbage collection?

Penny Liu
  • 15,447
  • 5
  • 79
  • 98
rix
  • 10,104
  • 14
  • 65
  • 92
  • 18
    "Deterministic object lifetimes". Similar as C++. – user2864740 Sep 20 '15 at 08:41
  • @user2864740 That guide is well out of date. The modern replacement would probably be http://doc.rust-lang.org/book/references-and-borrowing.html. – Veedrac Sep 20 '15 at 21:12
  • Rust **is** garbage collected, like any other practical programming language. It's just that [everybody thinks about garbage collection the wrong way](https://devblogs.microsoft.com/oldnewthing/20100809-00/?p=13203). – IInspectable Feb 06 '22 at 08:16

4 Answers4

114

Garbage collection is typically used periodically or on demand, like if the heap is close to full or above some threshold. It then looks for unused variables and frees their memory, depending on the algorithm.

Rust would know when the variable gets out of scope or its lifetime ends at compile time and thus insert the corresponding LLVM/assembly instructions to free the memory.

Rust also allows some kind of garbage collection, like atomic reference counting though.

Ayonix
  • 1,872
  • 1
  • 15
  • 14
  • By allocating memory when introducing variables and freeing memory when the memory is no longer needed? I don't really know what you want to say with that. Maybe we have different opinions on what a GC is then. – Ayonix Sep 20 '15 at 09:06
  • 1
    His question is how Rust's approach differs from a typical GC. So I explained what a GC is and how Rust does it without a GC. – Ayonix Sep 20 '15 at 09:08
  • Was I wrong in thinking that if you create an object simply with 'let', it would be created on the stack? And no garbage collection would be necesary? If a doesn't allocate something in it's ctor, of course. – Amomum Sep 21 '15 at 20:47
  • 1
    https://doc.rust-lang.org/book/the-stack-and-the-heap.html explains it pretty well. Yes, many things are in the stack but let alone is no sufficient indicator (see Box). I left that out for the sake of simplicity, since the question was asking generally though – Ayonix Sep 21 '15 at 21:04
  • @Ayonix thanks, I just wanted to check my sanity, that in Rust we need explicitly call some kind of new() to allocate on heap. – Amomum Sep 22 '15 at 08:27
  • 1
    @Amomum Actually Rust doesn't have any anointed `new()` function like C, they are just static functions, and in particular something like `let x = MyStruct::new()` creates its object on the stack. The *real* indicator of heap allocation is `Box::new()` (or any of the structures that depend on Box). – Mario Carneiro Jul 13 '16 at 09:34
  • 1
    What other languages handle memory management in a similar way to Rust? – still_dreaming_1 Mar 24 '17 at 19:03
  • @still_dreaming_1 I don't think any other languages do. This novel approach to memory management is kind of Rust's specialty. Although Zig doesn't have a GC either and has some safety guarantees: https://ziglang.org/ – J Winnie Aug 28 '20 at 02:04
64

The basic idea of managing resources (including memory) in a program, whatever the strategy, is that the resources tied to unreachable "objects" can be reclaimed. Beyond memory, those resources can be mutex locks, file handles, sockets, database connections...

Languages with a garbage collector periodically scan the memory (one way or another) to find unused objects, release the resources associated with them, and finally release the memory used by those objects.

Rust does not have a GC, how does it manage?

Rust has ownership. Using an affine type system, it tracks which variable is still holding onto an object and, when such a variable goes out of scope, calls its destructor. You can see the affine type system in effect pretty easily:

fn main() {
    let s: String = "Hello, World!".into();
    let t = s;
    println!("{}", s);
}

Yields:

<anon>:4:24: 4:25 error: use of moved value: `s` [E0382]
<anon>:4         println!("{}", s);

<anon>:3:13: 3:14 note: `s` moved here because it has type `collections::string::String`, which is moved by default
<anon>:3         let t = s;
                     ^

which perfectly illustrates that at any point in time, at the language level, the ownership is tracked.

This ownership works recursively: if you have a Vec<String> (i.e., a dynamic array of strings), then each String is owned by the Vec which itself is owned by a variable or another object, etc... thus, when a variable goes out of scope, it recursively frees up all resources it held, even indirectly. In the case of the Vec<String> this means:

  1. Releasing the memory buffer associated to each String
  2. Releasing the memory buffer associated to the Vec itself

Thus, thanks to the ownership tracking, the lifetime of ALL the program objects is strictly tied to one (or several) function variables, which will ultimately go out of scope (when the block they belong to ends).

Note: this is a bit optimistic, using reference counting (Rc or Arc) it is possible to form cycles of references and thus cause memory leaks, in which case the resources tied to the cycle might never be released.

Matthias Braun
  • 32,039
  • 22
  • 142
  • 171
Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • 3
    "Languages with a Garbage Collector periodically scan the memory (one way or another)". Many do but that is not true in general. Real-time garbage collectors scan incrementally rather than periodically. Reference counting languages like Mathematica don't scan at all. – J D Jun 04 '16 at 22:50
  • @JonHarrop: I do not count reference-counting as a complete Garbage Collection mechanism since it must be supplemented to avoid leaking cycles. As for the incremental/periodic difference, it may my poor command of English, but I fail to see how periodic does not cover the incremental case... I think that the "(one way or another)" bit adequately conveys that many varied approaches exist. In any case, if you have a better way of succinctly describing Garbage Collection, please suggest away. I have, however, no intention of launching myself in a full-blown explanation: I am unqualified for it. – Matthieu M. Jun 05 '16 at 10:42
  • 6
    "I do not count reference-counting as a complete Garbage Collection mechanism since it must be supplemented to avoid leaking cycles". RC is conventionally regarded as a form of GC. In Mathematica and Erlang, for example, cycles cannot be created by design so RC does not leak. For a high-level perspective, see "A unified theory of garbage collection" https://www.cs.virginia.edu/~cs415/reading/bacon-garbage.pdf – J D Jun 05 '16 at 16:24
  • @JonHarrop: True, if no cycle is possible then RC cannot leak. – Matthieu M. Jun 05 '16 at 17:41
  • 3
    "I fail to see how periodic does not cover the incremental case". Stop the world algorithms would be regarded as periodic whereas tricolor marking is regarded as incremental, for example. They are opposites in this context. – J D Jun 07 '16 at 20:14
  • 2
    @JD You're going way too deep. His explanation doesn't have to do with the internals of how GCs work, only the differences between GC and non-GC languages. The differentiation that you're trying to make is based on the implementation of GCs themselves. The differentiation that he's trying to make is between GCs in the abstract. There's no need to delve 500 words into the semantic meaning of "periodic" in this context. – Nicholas Pipitone Sep 16 '20 at 21:29
  • 1
    Abstractly, we normally consider langauges like C++/Rust that use RAII/RC as non-garbage-collecting. And languages such as Java/Python/C# to be garbage collecting (Even if it uses RC as an underlying implementation). The core difference is that in C++/Rust, the RC is explicit, and it's virtually a 5-line wrapper around calling malloc and free yourself. When in a GC language, it's abstracted from view, and classes are passed by reference rather than by value. (And the language specification rarely mentioned whether or not its RC or Mark-and-sweep, that's normally an implementation detail) – Nicholas Pipitone Sep 16 '20 at 21:35
  • @JD The link is broken. Can you give a documentation etc. on how the design of those languages makes it impossible to create a reference cycle? – Ekrem Dinçel Jan 10 '21 at 13:15
  • Here's a new link to the "Unified Theory of Garbage Collection" paper https://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon04Unified.pdf – J D Jan 11 '21 at 11:30
  • For Erlang's design I recommend finding an Erlang group and asking them for the best references. – J D Jan 11 '21 at 11:39
7

With a language where you must manually manage memory, the distinction between the stack and the heap becomes critical. Every time you call a function, enough space is allocated on the stack for all variables contained within the scope of that function. When the function returns, the stack frame associated with that function is "popped" off the stack, and the memory is freed for future use.

From a practical standpoint, this inadvertent memory cleaning is used as a means of automatic memory storage that will be cleared at the end of the function's scope.

There is more information available here: https://doc.rust-lang.org/book/the-stack-and-the-heap.html

Swiss
  • 5,556
  • 1
  • 28
  • 42
  • 5
    While using the stack is handy, deterministic object lifetimes can still be handled if all values were 'created on the heap'. Thus it is an implementation detail; not necessarily a language strategy. – user2864740 Sep 20 '15 at 08:47
  • 2
    You keep using that word. I do not think it means what you think it means. – Swiss Sep 20 '15 at 08:51
  • Means [what I wish to express](http://programmers.stackexchange.com/questions/240767/why-is-there-no-deterministic-object-destruction-in-java); being the opposite of non-deterministic lifetimes. Make an offer for a better phrase. – user2864740 Sep 20 '15 at 08:55
  • Thanks for the answer, i've give the points to the first one simply because it was submitted first. The information is just as useful and valid. – rix Sep 20 '15 at 09:07
  • @user2864740 Deterministic object lifetimes refers to being able to tell exactly when the object's memory will be cleared once its destructor has been called. It has nothing to do with how that destructor is called in the first place. You keep bringing up the same term repeatedly even though it has no direct significance to the question. – Swiss Sep 20 '15 at 09:09
  • I wish people forgot that there was any stack, because, like, most of what is "on the stack" is not on the stack at all. So it's silly. I have several 50k LOC C code projects that use automatic local variables, and the only things that go on the stack are return addresses. In fact, there's a special verification step before linking that terminates the build if anything but return addresses ever reaches the stack. So, in general, stack is not only an abstraction, but an unnecessary one. – Kuba hasn't forgotten Monica Jul 01 '20 at 16:30
  • @Kubahasn'tforgottenMonica I don't really understand your example. May you elaborate? How your automatic local variables are not in the stack? – Ekrem Dinçel Jan 10 '21 at 13:23
2

Some languages have reference counting, some have garbage collectors. Rust avoids both, instead, it allows only a single variable name or alias if you like to own a memory location at any point in time. You can move the ownership from one variable name to another, but you can’t have two variable names pointing to the same memory address (Except for shared Ownership. Rust provides the reference-counted pointer types Rc and Arc. ex: a doubly linked list).

Rust uses a relatively unique memory management approach that incorporates the idea of memory “ownership”. Basically, Rust keeps track of who can read and write to memory. It knows when the program is using memory and immediately frees the memory once it is no longer needed. It enforces memory rules at compile time, making it virtually impossible to have runtime memory bugs. You do not need to manually keep track of memory. The compiler takes care of it.

Discord recently switched from Go to Rust in one of its services just because garbage collector was causing latency. They explained very well why they did this and you will learn more about the garbage collector and rust memory system:

https://discord.com/blog/why-discord-is-switching-from-go-to-rust#:~:text=Discord%20is%20a%20product%20focused,and%20messages%20you%20have%20read.

Yilmaz
  • 35,338
  • 10
  • 157
  • 202
  • *"you can’t have two variable names pointing to the same memory address"* - I'm pretty sure that is exactly what `Rc`/`Arc` do. – IInspectable Feb 06 '22 at 08:27
  • Rc and Arc, allow values to have multiple owners, under some Restrictions. – Yilmaz Apr 30 '22 at 13:31
  • I don't see how that invalidates my previous comment. Either way, your comment is in conflict with your statement: *"Every value has a single owner"*. – IInspectable Apr 30 '22 at 14:26