7

Let's say I have these two structs

struct Node<T> {
    value: T,
    next: Box<Option<Node<T>>>
}

struct Node2<T> {
    value: T,
    next: Option<Box<Node2<T>>>
}

What's the difference in memory layout? Which one is better?

Dulguun Otgon
  • 1,925
  • 1
  • 19
  • 38
  • TLDR: that's the same thing. `next` is converted to a pointer to `Node`: https://play.integer32.com/?version=stable&mode=debug&edition=2018&gist=d41d7d3ecea770280f3fadc2da0f170a – Boiethios Nov 21 '19 at 10:07
  • 8
    There are certainly differences between the two definitions. Differences which I argue are not explicitly covered by https://stackoverflow.com/questions/16504643/what-is-the-overhead-of-rusts-option-type . (For example, there are clear reasons to prefer `Node2` (the one that does `Option>`), but it is interesting to note that `Option` is itself smaller than `Option`, because the former has a niche to store the discriminant.) – pnkfelix Nov 21 '19 at 11:03

1 Answers1

4

TL;DR: It is better to use Node2 because it avoids a heap allocation in case next is None.

For further explanation, let's simplify the question to: Is Option<Box> or Box<Option> better?

The first thing to note is that both variants only store one pointer:

type VoidPtr = *const ();
let pointer_size = mem::size_of::<VoidPtr>();
assert_eq!(mem::size_of::<Option<Box<i32>>>(), pointer_size);
assert_eq!(mem::size_of::<Box<Option<i32>>>(), pointer_size);

This is not very surprising for Box<Option>, since it actually is a pointer to an Option. But the disadvantage of using Box<Option> is that even if the value is None, the Box makes a heap allocation.

// Although the Option is None, the pointer is not null,
// so a heap allocation is done to store None.
let none_box_option: Box<Option<i32>> = Box::new(None);
let pointer = &*none_box_option as *const Option<i32>;
assert!(pointer.is_null().not());

However, it might be surprising that Option<Box> also just uses one pointer instead of, for example, a pointer plus a boolean indicating whether the contained value is Some or None. The trick used here (at least in release mode) is called null pointer optimization. A null pointer is used to indicate None. Every other value is a Some at the specified memory address. This leads to not having to make a heap allocation in case the value is None. Instead, the pointer will be set to null.

// A Option<Box> indicates None through a null pointer.
let none_option_box: Option<Box<i32>> = None;
assert!(as_bytes(&none_option_box).iter().all(|&byte| byte == 0));

// A Some Option<Box> simply holds the pointer to the inner value.
let some_option_box: Option<Box<i32>> = Some(Box::new(42));
let inner_bytes = as_bytes(&some_option_box);
let pointer = some_option_box.as_ref().unwrap().as_ref() as *const i32;
let pointer_bytes = as_bytes(&pointer);
assert_eq!(inner_bytes, pointer_bytes);

To get back to the original question: The observed behavior transfers to Node and Node2. Both need the same amount of memory.

assert_eq!(mem::size_of::<Node<()>>(), mem::size_of::<Node2<()>>());
assert_eq!(mem::size_of::<Node<u64>>(), mem::size_of::<Node2<u64>>());
assert_eq!(mem::size_of::<Node<String>>(), mem::size_of::<Node2<String>>());

However, if next is None, then Node will make a heap allocation and Node2 does not. Therefore I would advise using Node2.


Full code example

linuskmr
  • 727
  • 3
  • 13