Performance comparison of a Vec and a boxed slice

Question

I want a function to

allocate a basic variable-length "array" (in the generic sense of the word, not necessarily the Rust type) of floats on the heap
initialize it with values
implement Drop, so I don't have to worry about freeing memory
implement something for indexing or iterating

The obvious choice is Vec, but how does it compare to a boxed slice on the heap? Vec is more powerful, but I need the array for numerical math and, in my case, don't need stuff like push/pop. The idea is to have something with less features, but faster.

Below I have two versions of a "linspace" function (a la Matlab and numpy),

"linspace_vec" (see listing below) uses Vec
"linspace_boxed_slice" (see listing below) uses a boxed slice

Both are used like

let y = linspace_*(start, stop, len);

where y is a linearly spaced "array" (i.e. a Vec in (1) and a boxed slice in (2)) of length len.

For small "arrays" of length 1000, (1) is FASTER. For large arrays of length 4*10^6, (1) is SLOWER. Why is that? Am I doing something wrong in (2)?

When the argument len = 1000, benchmarking by just calling the function results in

(1) ... bench: 879 ns/iter (+/- 12)
(2) ... bench: 1,295 ns/iter (+/- 38)

When the argument len = 4000000, benchmarking results in

(1) ... bench: 5,802,836 ns/iter (+/- 90,209)
(2) ... bench: 4,767,234 ns/iter (+/- 121,596)

Listing of (1):

pub fn linspace_vec<'a, T: 'a>(start: T, stop: T, len: usize) -> Vec<T>
where
    T: Float,
{
    // get 0, 1 and the increment dx as T
    let (one, zero, dx) = get_values_as_type_t::<T>(start, stop, len);
    let mut v = vec![zero; len];
    let mut c = zero;
    let ptr: *mut T = v.as_mut_ptr();
    unsafe {
        for ii in 0..len {
            let x = ptr.offset((ii as isize));
            *x = start + c * dx;
            c = c + one;
        }
    }

    return v;
}

Listing of (2):

pub fn linspace_boxed_slice<'a, T: 'a>(start: T, stop: T, len: usize) -> Box<&'a mut [T]>
where
    T: Float,
{
    let (one, zero, dx) = get_values_as_type_t::<T>(start, stop, len);
    let size = len * mem::size_of::<T>();
    unsafe {
        let ptr = heap::allocate(size, align_of::<T>()) as *mut T;
        let mut c = zero;
        for ii in 0..len {
            let x = ptr.offset((ii as isize));
            *x = start + c * dx;
            c = c + one;
        }
        // IS THIS WHAT MAKES IT SLOW?:
        let sl = slice::from_raw_parts_mut(ptr, len);
        return Box::new(sl);
    }
}

If I understand the function, don't think you need to drop to that level to go fast. Iterators are pretty awesome: [bench: 2,755,606 ns/iter (+/- 408,352)](http://is.gd/ePFynw). — Shepmaster, Nov 14 '15 at 03:18

score 41 · Accepted Answer · edited Dec 31 '18 at 15:00

41

In your second version, you use the type Box<&'a mut [T]>, which means there are two levels of indirection to reach a T, because both Box and & are pointers.

What you want instead is a Box<[T]>. I think the only sane way to construct such a value is from a Vec<T>, using the into_boxed_slice method. Note that the only benefit is that you lose the capacity field that a Vec would have. Unless you need to have a lot of these arrays in memory at the same time, the overhead is likely to be insignificant.

pub fn linspace_vec<'a, T: 'a>(start: T, stop: T, len: usize) -> Box<[T]>
where
    T: Float,
{
    // get 0, 1 and the increment dx as T
    let (one, zero, dx) = get_values_as_type_t::<T>(start, stop, len);
    let mut v = vec![zero; len].into_boxed_slice();
    let mut c = zero;
    let ptr: *mut T = v.as_mut_ptr();
    unsafe {
        for ii in 0..len {
            let x = ptr.offset((ii as isize));
            *x = start + c * dx;
            c = c + one;
        }
    }

    v
}

edited Dec 31 '18 at 15:00

Shepmaster

388,571
95
1,107
1,366

answered Nov 14 '15 at 03:02

Francis Gagné

60,274
7
180
155

2

Wow, you guys are quick in replying:) Thanks! – poidl Nov 14 '15 at 03:14
3

There is another way to create `Box<[T]>`, from a boxed fixed size array (`Box<[T; N]>`), it can be cast or it will simply coerce. – bluss Nov 14 '15 at 05:06
@bluss: but can I make a [T;N] with VARIABLE N (before I box it)? – poidl Nov 14 '15 at 05:29
3

There doesn't appear to be a way to allocate a variable-length array directly in the language or in the standard library as of Rust 1.4.0, besides using `Vec`. If you want to avoid creating a `Vec`, look up how it allocates a variable-length array and how it transforms it into a `Box<[T]>` in [the source code](https://github.com/rust-lang/rust), then replicate the code in your program/library, keeping only the parts that are necessary. – Francis Gagné Nov 14 '15 at 18:13
Excellent, I'll do that. I already had a quick look at boxed.rs, especially IntermediateBox. But yeah, maybe I'll start at the other end (i.e. Vec). I also saw that in Rustonomicon, the Vec is fully explained. Plenty to do on the coming Saturdays :) – poidl Nov 14 '15 at 20:50
1

For a variable length box slice, there is now a nightly only experimental API [Box::::new_uninit_slice(len: usize)](https://doc.rust-lang.org/std/boxed/struct.Box.html#method.new_uninit_slice). I tried using it to cut down on WASM binary size for a pathetic result: 8624B to 8317B... – Chinoto Vokro Dec 10 '19 at 04:49
You might say that another benefit to converting to a boxed slice is that you eliminate the possibility of entries being added/removed (when that would be undesirable) while still retaining mutability for the existing entries. – AmigoNico Apr 08 '21 at 21:28
1

since 1.32 there is also `FromIterator for Box<[T]>`, so you can collect directly into one. – matt Sep 07 '22 at 06:37

Performance comparison of a Vec and a boxed slice

1 Answers1

Linked

Related