1

I have been testing on fastest methods to create and initialize 2d matrix in rust.

Methods I tried (Initialize 1000x1000 2d array):

Method Time
Array2::<i32>::zeros((width, height)) (using ndarray package) 47μs
vec![vec![0; height]; width] 8.0649ms
[[u32; width]; height] = [[0; width]; height] 301.4µs

How ndarray package can initialize 2d array so fast? Can we achieve same speed using primitives type?

Herohtar
  • 5,347
  • 4
  • 31
  • 41
  • `vec!` does not produce arrays, but vectors, which are (slightly) more expensive. – jthulhu Apr 21 '22 at 04:57
  • Now try `vec![0; width * height]`, which should get very close to the ndarray timing, if not exactly the same. See also [this answer I just wrote today](https://stackoverflow.com/a/71946892/501250) on why creating a zero-initialized Vec of a numeric type is so fast. The bottom two examples you give require initialization of at least the outer container layer as the element type is another container. – cdhowie Apr 21 '22 at 06:05
  • @BlackBeans The important bit is that vector data is allocated on the heap -- but so are ndarray values, so this doesn't really explain anything. The first two methods given _both_ allocate on the heap (though the second one has `width + 1` allocations). – cdhowie Apr 21 '22 at 06:10
  • @cdhowie indeed, you are right. In fact, I looked up what `Array2::zeros` does, and it (in the end) calls `vec![0; width * height]`, among other things. So `vec![0; width*height]` should be at least as fast as `Array::zeros`. – jthulhu Apr 21 '22 at 06:14
  • Converted my comment to a proper answer. – cdhowie Apr 21 '22 at 06:19

1 Answers1

1

Because under the hood it's allocating a one-dimensional vector. Try this instead:

vec![0; width * height]

A zero-initialized vector of a numeric type can be created very quickly because initialization isn't necessary. However, your second method allocates width + 1 vectors, and requires initialization of the outer container. This is not only needlessly wasteful in terms of memory consumption, but will likely cause poor data locality and possible heap fragmentation as nothing requires the heap allocations for the inner vectors to be near to each other. The performance hit here is almost certainly due to having to perform 1,000 extra heap allocations.

The third method creates an array of arrays. This lays out the data contiguously, so is better than the second method, but with this approach Rust can't skip the initialization step like it can when allocating from the heap, and so it takes a bit longer.

cdhowie
  • 158,093
  • 24
  • 286
  • 300
  • Thank you for your answer, so the reason ndarra/1d vec are faster than array is because it skip the initialization step. And I just run another exp for comparing initialization. The second and third method are significantly faster. PS: what will be a more percise way to mesure? Currently I am using Instant::now() and elapsed – Kelvin_neverKnow Apr 21 '22 at 08:06
  • For benchmarking, particularly micro-benchmarking, criterion (https://bheisler.github.io/criterion.rs/book/index.html) is the usual recommendation. – PossiblyAShrub Apr 22 '22 at 21:35