4

I need to initialize a vector for use as a buffer. I don't care what values it contains before I put something in it, so I don't want the program to waste time filling it with zeroes. I know about with_capacity, but it requires me to push() elements, which is inconvenient because I would need to constantly check if I have pushed something to an index before or not.

Basically, I'm looking for an equivalent of this C++ array:

int* arr = new int[size];
arr[2]; // random garbage
splaytreez
  • Reading uninitialized memory is currently always considered UB, unless the type of the target value is `MaybeUninit`. What do you expect to do with these values? Or do you want to preallocate the vector and then fill it non-sequentially? – Cerberus Apr 24 '22 at 15:21
  • It's not critical at all for me, I'm just solving a simple question. But it got me wondering because it's something I would do without hesitation in c++ and didn't expect that to be a problem. – splaytreez Apr 24 '22 at 15:33
  • @splaytreez you should definitely hesitate doing this in C++ because [reading from an uninitialized variable is undefined behavior](https://stackoverflow.com/questions/30180417/what-happens-when-i-print-an-uninitialized-variable-in-c) and should be avoided at all costs. – kmdreko Apr 24 '22 at 16:18
  • @kmdreko The algorithm worked in such a way that an index is always initialized by the time I read from it – splaytreez Apr 25 '22 at 10:58

3 Answers

7

If you cannot afford to fill the vector for performance reasons (but please benchmark!), you need to use unsafe code. For example, you can use Vec::spare_capacity_mut() (or the nightly Vec::split_at_spare_mut()) to access the uninitialized elements, then call Vec::set_len() once they are all written:

let mut v = Vec::with_capacity(N);
for (i, item) in v.spare_capacity_mut().iter_mut().enumerate() {
    item.write(i);
}
// SAFETY: All elements were initialized.
unsafe {
    v.set_len(N);
}

Or you can manually manage the pointers with as_mut_ptr() (less recommended, but may be better if some of the elements are already initialized):

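let mut v = Vec::<i32>::with_capacity(N);
// SAFETY: The allocation has room for at least N elements, and
// MaybeUninit<i32> has the same layout as i32, so viewing the
// (possibly uninitialized) slots this way is valid.
let all_elements = unsafe {
    std::slice::from_raw_parts_mut(v.as_mut_ptr().cast::<std::mem::MaybeUninit<i32>>(), N)
};
for (i, item) in all_elements.iter_mut().enumerate() {
    // `i` is a usize; convert it to the element type before writing.
    item.write(i as i32);
}
// SAFETY: All N elements were initialized by the loop above.
unsafe {
    v.set_len(N);
}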
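
Since the question is about filling a buffer out of order rather than front to back, here is a minimal sketch of that case using spare_capacity_mut(). The function name fill_even_then_odd and the "slot i holds i * 10" rule are made up for illustration; the point is only that set_len() is sound as long as every index below the new length has been written by then.

```rust
/// Hypothetical helper: fills a buffer out of order (even slots first, then
/// odd slots) without zeroing it first. Slot `i` ends up holding `i * 10`.
fn fill_even_then_odd(n: usize) -> Vec<usize> {
    let mut v: Vec<usize> = Vec::with_capacity(n);
    {
        // `with_capacity` may reserve more than `n` slots, so only take the first `n`.
        let slots = &mut v.spare_capacity_mut()[..n];
        for i in (0..n).step_by(2) {
            slots[i].write(i * 10);
        }
        for i in (1..n).step_by(2) {
            slots[i].write(i * 10);
        }
    }
    // SAFETY: the two loops together wrote every index in 0..n, and n <= capacity.
    unsafe { v.set_len(n) };
    v
}

fn main() {
    let v = fill_even_then_odd(5);
    println!("{:?}", v); // prints [0, 10, 20, 30, 40]
}
```

If the algorithm cannot guarantee that every slot is written before set_len(), this pattern is unsound and one of the safe alternatives is the better choice.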

Chayim Friedman
4

This is hard and, to some extent, anti-idiomatic, because Rust tries to prevent you from manipulating uninitialized memory by any means. Even though this sounds like a limitation if you come from C++, you have to realize that not everything translates literally from C++ to Rust, even though the two languages share a lot.

For instance, if you:

  • don't want to fill a vector with default values,
  • don't want to push values as you go,

then you may be better off building an iterable, and collecting it.

You could also fill the vector with Nones.
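
As a rough sketch of both suggestions (the function names squares and sparse_fill are invented for this example): the first builds the buffer in one pass by collecting an iterator, and the second uses None as an explicit "not written yet" marker so that slots can be filled in any order.

```rust
/// Builds the whole buffer in one pass from an iterator, so no slot is
/// ever observable in an uninitialized state.
fn squares(n: usize) -> Vec<u64> {
    (0..n as u64).map(|i| i * i).collect()
}

/// Uses `None` as an explicit "not written yet" marker for out-of-order fills.
fn sparse_fill(n: usize, writes: &[(usize, i32)]) -> Vec<Option<i32>> {
    let mut buf = vec![None; n];
    for &(index, value) in writes {
        buf[index] = Some(value); // panics if index >= n
    }
    buf
}

fn main() {
    println!("{:?}", squares(4)); // prints [0, 1, 4, 9]
    println!("{:?}", sparse_fill(3, &[(2, 7), (0, 1)])); // prints [Some(1), None, Some(7)]
}
```

The Option version also answers the "have I written this index yet?" question directly, at the cost of initializing the buffer and, for types like i32, extra space per element.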

In particular, Rust pushes you not to leave open the possibility of a variable pointing to uninitialized memory. That is, even if you manage to obtain uninitialized memory (as explained below), Rust requires you to use unsafe blocks to access it until you have "certified" that it is all initialized. Even inside unsafe code you are never allowed to read from uninitialized memory: doing so is UB, which is the worst that could happen.

If you really want to work with uninitialized memory, then read the relevant chapter of the Rustonomicon. In the end, what you will use is MaybeUninit. However, read the Rustonomicon carefully too (not only the doc page of MaybeUninit), as the topic has subtleties.
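
For orientation only, here is a small sketch of what a MaybeUninit-based buffer can look like. The function reversed_via_uninit is a made-up example, not something from the Rustonomicon; it relies on the fact that an uninitialized MaybeUninit<i32> is a valid value of its own type, so creating the buffer needs no unsafe, and only the final "assume it is initialized" step does.

```rust
use std::mem::MaybeUninit;

/// Hypothetical example: writes `input` into a fresh buffer back to front.
fn reversed_via_uninit(input: &[i32]) -> Vec<i32> {
    let n = input.len();
    let mut buf: Vec<MaybeUninit<i32>> = Vec::new();
    // Safe: each new element is an uninitialized `MaybeUninit<i32>`,
    // which is allowed to hold garbage.
    buf.resize_with(n, MaybeUninit::uninit);

    // Fill the slots out of order (last slot first).
    for (i, &x) in input.iter().enumerate() {
        buf[n - 1 - i].write(x);
    }

    // SAFETY: the loop above wrote every index in 0..n exactly once.
    buf.into_iter()
        .map(|slot| unsafe { slot.assume_init() })
        .collect()
}

fn main() {
    println!("{:?}", reversed_via_uninit(&[1, 2, 3])); // prints [3, 2, 1]
}
```

Calling assume_init() on a slot that was never written would be UB, so the safety comment has to be backed by the algorithm itself.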

jthulhu
3

If your goal is to store for example integer values, you can efficiently use the vec macro to initialize the vector:

let vec = vec![0; 5];
assert_eq!(vec, [0, 0, 0, 0, 0]);

This form of macro invocation takes two parameters. The first one is the value of the element to insert. The second is the number of copies of the element to insert.

It is equivalent to:

let mut vec = Vec::with_capacity(5);
vec.resize(5, 0);

See https://doc.rust-lang.org/std/vec/struct.Vec.html for more details.

Note that when you say:

waste time filling it with zeroes

you probably want to quantify this. Note that it appears that the operation I described above is O(n) in the general case (see https://doc.rust-lang.org/src/alloc/vec/mod.rs.html#2315) so it might not fit your use case.

If however your vector contains integer values, there is a good chance the operation will be O(1) as per the explanation from this answer. As said elsewhere, please benchmark first.
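
To make "benchmark first" concrete, here is a crude timing sketch using only the standard library. The helper names (zeroed_then_fill, collect_fill, time_it), the buffer size, and the round count are arbitrary choices for illustration, and std::time::Instant is a blunt tool; a dedicated benchmarking harness will give more trustworthy numbers. No particular result is implied.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Zero-fills the buffer first, then overwrites every slot.
fn zeroed_then_fill(n: usize) -> Vec<u64> {
    let mut v = vec![0u64; n];
    for (i, slot) in v.iter_mut().enumerate() {
        *slot = i as u64;
    }
    v
}

/// Produces the same contents in one pass, with no separate zeroing step.
fn collect_fill(n: usize) -> Vec<u64> {
    (0..n as u64).collect()
}

/// Average wall-clock time per call over `rounds` calls (`rounds` must be > 0).
fn time_it<F: FnMut() -> Vec<u64>>(mut f: F, rounds: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..rounds {
        // `black_box` discourages the optimizer from discarding the work.
        black_box(f());
    }
    start.elapsed() / rounds
}

fn main() {
    let n = 1 << 20;
    let a = time_it(|| zeroed_then_fill(black_box(n)), 20);
    let b = time_it(|| collect_fill(black_box(n)), 20);
    println!("vec![0; n] then overwrite: {:?}", a);
    println!("collect without zeroing:   {:?}", b);
}
```

Build with optimizations (for example `cargo run --release`) before reading anything into the numbers.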

SirDarius
  • "Note that it appears that the operation I described above is O(n)" Yeah, that's why I was looking for another way. It's not something critical, I was just solving a simple question and got curious – splaytreez Apr 24 '22 at 15:46
  • I was somewhat hoping initialisation was O(1) thanks to some calloc-like allocation tricks, such as asking for zeroed pages from the OS. I haven't found anything conclusive so far. – SirDarius Apr 24 '22 at 16:31
  • @splaytreez this is not as inefficient as it looks (at least, not on all OSes). See [this post](https://stackoverflow.com/questions/71946484/why-is-vec0-super-large-number-so-memory-efficient-when-the-default-value) to understand why. – jthulhu Apr 25 '22 at 05:32
  • @BlackBeans Thanks for the link. I was highly suspecting something like this existed! I need to amend my answer then. – SirDarius Apr 25 '22 at 08:15