15

In this code...

struct Test { a: i32, b: i64 }
    
fn foo() -> Box<Test> {              // Stack frame:
    let v = Test { a: 123, b: 456 }; // 16 bytes on typical 64-bit targets (12 bytes of fields plus padding)
    Box::new(v)                      // `usize` bytes (`*const T` in `Box`)
}

... as far as I understand (ignoring possible optimizations), v gets allocated on the stack and then copied to the heap, before being returned in a Box.
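The sizes noted in the comments above can be checked rather than estimated. A minimal sketch, assuming nothing beyond the standard library; the helper name `sizes` is made up for illustration:

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Test {
    a: i32,
    b: i64,
}

/// Returns (size of `Test`, size of `Box<Test>`) in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<Test>(), size_of::<Box<Test>>())
}

fn main() {
    let (value, pointer) = sizes();
    // The struct size includes any alignment padding the target requires.
    println!("Test: {} bytes, Box<Test>: {} bytes", value, pointer);
}
```

The box itself is pointer-sized, so only the `Test` value is large enough for the stack copy to matter.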

And this code...

fn foo() -> Box<Test> {
    Box::new(Test { a: 123, b: 456 })
}

...shouldn't be any different, presumably, since there should be a temporary for the struct value (assuming the compiler doesn't have any special semantics for the instantiation expression inside Box::new()).

I've found "Do values in return position always get allocated in the parents stack frame or receiving Box?". Regarding my specific question, it only proposes the experimental box syntax and otherwise mostly discusses compiler optimizations (copy elision).

So my question remains: using stable Rust, how does one allocate structs directly on the heap, without relying on compiler optimizations?

Shepmaster
mrnateriver
  • 3
    This would be a totally absurd thing to do; let the compiler do the magic for you. But if you want it, see https://doc.rust-lang.org/std/alloc/trait.Alloc.html. – Stargateur Dec 08 '19 at 07:24
  • 3
    Thanks for the link. However, I strongly disagree about absurdity. I think any "magic" is inherently unreliable, let alone that it requires knowledge of compiler internals. Moreover, when performance is critical, or resources are limited, all memory operations are important, and in this particular case it would be nice to be sure that no unnecessary operations are performed. – mrnateriver Dec 08 '19 at 08:30
  • I should also note that even though the provided link technically solves the problem, I'm pretty sure it's not the idiomatic way. It's just too low-level. For that matter, I could probably have called an allocator directly and gotten a raw pointer to the heap. I'm pretty sure there must be a language-level (or "stdlib-level") construct for that. – mrnateriver Dec 08 '19 at 08:46
  • 2
    Another "solution" is to use https://doc.rust-lang.org/std/boxed/struct.Box.html#method.new_uninit; it's new, so I didn't think of it before. Still low-level in my opinion. But using this to "avoid using the stack" is stupid. – Stargateur Dec 08 '19 at 10:19
  • @Stargateur In debug builds the compiler will not do the magic for you, and you will get a stack overflow. – Josu Goñi Aug 30 '22 at 11:55

5 Answers

10

As of Rust 1.39, there seems to be only one way in stable to allocate memory on the heap directly - by using std::alloc::alloc (note that the docs state that it is expected to be deprecated). It is an unsafe function, so using it correctly is entirely up to the caller.

Example:

#[derive(Debug)]
struct Test {
    a: i64,
    b: &'static str,
}

fn main() {
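    use std::alloc::{alloc, handle_alloc_error, Layout};

    unsafe {
        let layout = Layout::new::<Test>();
        let ptr = alloc(layout) as *mut Test;
        // `alloc` signals failure with a null pointer; never dereference it.
        if ptr.is_null() {
            handle_alloc_error(layout);
        }

        // Fill in each field in place. Neither field type has drop glue, so
        // plain assignment does not try to drop the uninitialized old value.
        (*ptr).a = 42;
        (*ptr).b = "testing";

        // `Box` takes ownership and later frees the memory with this layout.
        let bx = Box::from_raw(ptr);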

        println!("{:?}", bx);
    }
}

The same approach is used by Box::new_uninit, which was unstable at the time of writing (it has since been stabilized, in Rust 1.82).

It turns out there's even a crate for avoiding memcpy calls (among other things): copyless. This crate also uses an approach based on this.

Shepmaster
mrnateriver
  • 2
    Shouldn't this use [`std::ptr::write`](https://doc.rust-lang.org/std/ptr/fn.write.html) instead of assignment with `=` to not drop the old value at that location? – Optimistic Peach Jul 20 '20 at 18:34
  • 2
    This code causes undefined behavior when it attempts to assign the values for `a` and `b` as it constructs a reference to the fields `a` and `b`, but those have uninitialized junk in them. There's no way to do this correctly until [raw references](https://rust-lang.github.io/rfcs/2582-raw-reference-mir-operator.html) are stabilized. – Shepmaster Jul 20 '20 at 19:08
  • 1
    @OptimisticPeach that would be **better**, but it's not needed in _this specific case_ because `i64` and `&str` don't have a `Drop` implementation. – Shepmaster Jul 20 '20 at 19:09
  • @Shepmaster Whoops, sorry, I misread `&str` as `String`, hence my concern. – Optimistic Peach Jul 20 '20 at 19:20
9

You seem to be looking for the box_syntax feature; however, as of Rust 1.39.0 it is not stable and is only available with a nightly compiler. It also seems that this feature will not be stabilized any time soon, and it might have a different design if it ever is.

On a nightly compiler, you can write:

#![feature(box_syntax)]

struct Test { a: i32, b: i64 }

fn foo() -> Box<Test> {
    box Test { a: 123, b: 456 }
}
Shepmaster
mcarton
  • This is a good answer. However, I was asking about *stable* Rust, and as it turns out, there is a solution for it, albeit not as elegant as this syntax. Moreover, since this syntax is basically a compiler-internal feature (https://github.com/rust-lang/rust/issues/49733), I reckon the stable solution will remain the way to go for a long time. – mrnateriver Dec 17 '19 at 07:18
  • The box syntax is perma unstable as of 2022-04-27: https://github.com/rust-lang/rust/issues/49733#issuecomment-1111283609 – Agost Biro Aug 02 '22 at 13:42
3

Is there a way to allocate directly to the heap without box?

No. If there was, it wouldn't need a language change.

People tend to avoid this by using the unstable syntax indirectly, such as by using one of the standard containers which, in turn, uses it internally.
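As an illustration of going through a standard container, here is a minimal sketch that builds a large zeroed array on the heap via `Vec`. The names `N` and `zeroed_on_heap` are made up for this example, and whether the buffer avoids the stack entirely depends on how the standard library implements `vec!`, which I am assuming rather than guaranteeing:

```rust
use std::convert::TryFrom;

const N: usize = 1_000_000;

/// Builds a zeroed `[u8; N]` on the heap by going through `Vec`.
fn zeroed_on_heap() -> Box<[u8; N]> {
    // The vector owns a heap buffer of N zero bytes; no `[u8; N]` value is
    // ever named as a local variable in this function.
    let boxed_slice: Box<[u8]> = vec![0u8; N].into_boxed_slice();

    // Reinterpret the boxed slice as a boxed array of the same length.
    match Box::<[u8; N]>::try_from(boxed_slice) {
        Ok(array) => array,
        Err(_) => unreachable!("the slice was created with length N"),
    }
}

fn main() {
    let buffer = zeroed_on_heap();
    println!("allocated {} bytes on the heap", buffer.len());
}
```

This only helps for values that a container can produce for you (here, a repeated element); it is not a general replacement for in-place construction of arbitrary structs.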


Shepmaster
2

With Rust 1.51 and above, one can use addr_of_mut! to write the fields without creating intermediate references, unlike the code in mrnateriver's answer. The Allocator trait is still unstable, so std::alloc::alloc is still needed.

#[derive(Debug)]
struct Test {
    a: i64,
    b: &'static str,
}

fn main() {
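    use std::alloc::{alloc, handle_alloc_error, Layout};
    use std::ptr::addr_of_mut;

    let layout = Layout::new::<Test>();

    let bx = unsafe {
        let ptr = alloc(layout) as *mut Test;
        // `alloc` signals failure with a null pointer.
        if ptr.is_null() {
            handle_alloc_error(layout);
        }

        // Write each field through a raw pointer; no reference to
        // uninitialized memory is ever created.
        addr_of_mut!((*ptr).a).write(42);
        addr_of_mut!((*ptr).b).write("testing");

        // Every field is initialized, so `Box` may take ownership.
        Box::from_raw(ptr)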
    };
    println!("{:?}", bx);
}
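The same pattern matters more when a field owns resources. A minimal sketch, assuming a hypothetical struct `Named` with a `String` field and a made-up helper `make_named`; `write` stores the new value without reading or dropping whatever bytes were there before, which plain assignment would attempt for a type with drop glue:

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::ptr::addr_of_mut;

#[derive(Debug)]
struct Named {
    id: u32,
    name: String, // has drop glue, unlike `i64` or `&'static str`
}

fn make_named() -> Box<Named> {
    let layout = Layout::new::<Named>();
    unsafe {
        let ptr = alloc(layout) as *mut Named;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }

        // `write` never drops the old (uninitialized) contents of the field.
        addr_of_mut!((*ptr).id).write(7);
        addr_of_mut!((*ptr).name).write(String::from("heap"));

        // All fields are initialized; the box now owns the allocation.
        Box::from_raw(ptr)
    }
}

fn main() {
    println!("{:?}", make_named());
}
```

Note that the `String` handle itself is still created as a temporary before being moved into place; only the enclosing struct is built directly in its heap slot.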
cafce25
-1

I recently had the same problem. Based on the answers here and other places, I wrote a simple function for heap allocation:

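/// # Safety
///
/// The returned box points to uninitialized memory: every byte of `T` must be
/// written before the value is read or dropped. `T` must not be zero-sized,
/// because `alloc` requires a layout with non-zero size.
pub unsafe fn unsafe_allocate<T>() -> Box<T> {
    use std::alloc::{alloc, handle_alloc_error, Layout};

    let layout = Layout::new::<T>();
    unsafe {
        let ptr = alloc(layout) as *mut T;
        // `alloc` signals failure with a null pointer.
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        Box::from_raw(ptr)
    }
}

This will create a region in memory sized for T and unsafely convince Rust that the region holds an actual T value. The memory may contain arbitrary data; do not assume all values are 0, and initialize everything before reading it. Because callers can trigger undefined behavior by misusing the result, the function is marked unsafe.

Example use:

let mut triangles: Box<[Triangle; 100000]> = unsafe { unsafe_allocate::<[Triangle; 100000]>() };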
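Another option is to encode the uninitialized state in the type, so that the allocating function itself can stay safe. The following is a sketch under my own assumptions, not a drop-in replacement: `allocate_uninit`, `finish_init` and `make_squares` are hypothetical names, and only stable `std::alloc` and `MaybeUninit` APIs are used.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::mem::MaybeUninit;
use std::ptr::NonNull;

/// Hypothetical helper: heap storage for a `T` that is not yet initialized.
fn allocate_uninit<T>() -> Box<MaybeUninit<T>> {
    let layout = Layout::new::<MaybeUninit<T>>();
    let ptr: *mut MaybeUninit<T> = if layout.size() == 0 {
        // `alloc` must not be called with a zero-sized layout; a dangling,
        // well-aligned pointer is what `Box` expects for zero-sized types.
        NonNull::<MaybeUninit<T>>::dangling().as_ptr()
    } else {
        let raw = unsafe { alloc(layout) } as *mut MaybeUninit<T>;
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        raw
    };
    // SAFETY: `ptr` is non-null and aligned, and `MaybeUninit<T>` places no
    // requirements on the bytes it contains.
    unsafe { Box::from_raw(ptr) }
}

/// # Safety
/// The whole `T` inside `slot` must have been initialized.
unsafe fn finish_init<T>(slot: Box<MaybeUninit<T>>) -> Box<T> {
    unsafe { Box::from_raw(Box::into_raw(slot) as *mut T) }
}

fn make_squares() -> Box<[u64; 4]> {
    let mut slot = allocate_uninit::<[u64; 4]>();
    let first = slot.as_mut_ptr() as *mut u64;
    for i in 0..4 {
        // SAFETY: `i` stays within the four elements of the array.
        unsafe { first.add(i).write((i * i) as u64) };
    }
    // SAFETY: all four elements were written in the loop above.
    unsafe { finish_init(slot) }
}

fn main() {
    println!("{:?}", make_squares());
}
```

With this shape, safe code can only obtain a `Box<MaybeUninit<T>>`, and the single unsafe step is the promise that initialization has finished.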
Shepmaster
mousetail
  • This code is non-idiomatic and improperly uses `unsafe` code. This can trivially be used to introduce undefined behavior. – Shepmaster Jul 20 '20 at 15:03
  • But it's the only way? All the other answers say that using `alloc` is the way to go, and this seems to be the most safe way to use `alloc` – mousetail Jul 20 '20 at 15:05
  • Even if I'm late, I think what Shepmaster means is that this function can be used to [cause UB in safe Rust](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=084ec3472e1f4700e118c46d34d57b56), it is *unsound*, at the very least it has to be defined as an `unsafe` function. – cafce25 Mar 19 '23 at 16:11
  • @cafce25 This is literally what the libraries mentioned do internally – mousetail Mar 19 '23 at 18:51
  • I don't quite follow that argument, they have that code so this is somehow sound? It's not. – cafce25 Mar 20 '23 at 18:41
  • Marking the function as unsafe would probably be an improvement – mousetail Mar 20 '23 at 18:49