
Consider the following example:

struct Foo {
    a: [u64; 100000],
}

fn foo(mut f: Foo) -> Foo {
    f.a[0] = 99999;
    f.a[1] = 99999;
    println!("{:?}", &mut f as *mut Foo);

    for i in 0..f.a[0] {
        f.a[i as usize] = 21444;
    }

    f
}

fn main() {
    let mut f = Foo {
        a: [0; 100000],
    };

    println!("{:?}", &mut f as *mut Foo);
    f = foo(f);
    println!("{:?}", &mut f as *mut Foo);
}

The address of f printed in main differs before and after the call to foo, and the address printed inside foo differs from both. Why does Rust copy such a big struct around instead of moving it in place (or eliding the copy as an optimization)?

I understand how stack memory works, but given the information that ownership provides, I think the copy can be avoided. As it stands, the compiler unnecessarily copies the array twice. Could the Rust compiler perform this optimization?

Boann
YangKeao
  • C++ does essentially the same. Huge data structures can be dumped on the stack without the `&` in the function or method declaration indicating that you want to pass a reference. (In my case that was a bug, 200K were dumped on a 16K stack in an embedded system. Since there was no memory protection several other stacks were also wiped out, and the system crashed shortly afterwards in unrelated code. Took me a few hours to find the single missing `&`.) – starblue Nov 25 '18 at 09:04
  • @starblue Whether or not you write `&` makes a big difference. Passing a reference to a variable shares the same memory. Without `&`, the copy constructor (or a plain copy) is used to create the argument variable, which has no connection to the original variable after the copy. A move in C++ happens when we use `&&` and `std::move` to trigger the move constructor. After being moved into a function, the variable can no longer be used. So a move in C++ has nearly the same semantics as in Rust (without the safety provided by the ownership system), but the performance is different. – YangKeao Nov 25 '18 at 09:36
  • And how would you move an array in a C++ move constructor? Use a box for big things. – Stargateur Nov 25 '18 at 19:39

1 Answer


A move is a memcpy followed by treating the source as non-existent.

Your big array is on the stack. That's just the way Rust's memory model works: local variables are on the stack. Since the stack space of foo is going away when the function returns, there's nothing else the compiler can do except copy the memory to main's stack space.

In some cases, the compiler can rearrange things so that the move can be elided (source and destination are merged into one thing), but this is an optimization that cannot be relied on, especially for big things.

If you don't want to copy the huge array around, allocate it on the heap yourself, either via a Box<[u64]>, or simply by using Vec<u64>.
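
As a minimal sketch of the heap-allocation advice above (not a statement about what the optimizer does with the original code): the variant of Foo below stores its array in a Box<[u64]>, so moving a Foo only moves the pointer and length, and the heap buffer itself stays where it is. The helper `data_addr` is a hypothetical name introduced here just to observe the buffer address.

```rust
// Hypothetical variant of the question's `Foo`: the array lives on the heap.
struct Foo {
    a: Box<[u64]>,
}

// Takes ownership and hands it back. Only the Box (pointer + length) is moved.
fn foo(mut f: Foo) -> Foo {
    for x in f.a.iter_mut() {
        *x = 21444;
    }
    f
}

// Hypothetical helper: address of the first element of the heap buffer.
fn data_addr(f: &Foo) -> *const u64 {
    f.a.as_ptr()
}

fn main() {
    let f = Foo {
        a: vec![0u64; 100_000].into_boxed_slice(),
    };
    let before = data_addr(&f);
    let f = foo(f);
    let after = data_addr(&f);
    // A move never reallocates the heap buffer, so both addresses match.
    assert_eq!(before, after);
    println!("{:?} {:?}", before, after);
}
```

The address of the Foo value itself (the small pointer-and-length pair on the stack) may still change across the call; what stays fixed is the location of the 100,000 elements.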

Sebastian Redl
  • Or pass the function an `&mut f` and return nothing, which would be idiomatic in this case. – starblue Nov 25 '18 at 08:57
  • In this particular case, the move could actually be avoided. The variable `f` is created in the stack frame of `main()`, and the compiler could statically determine that it's not necessary to move it into the stack frame of `foo()`, since it will be copied back to its original location anyway. But even in a release build with `foo()` marked as `#[inline(always)]`, the compiler still unnecessarily copies the array twice. – Sven Marnach Nov 25 '18 at 09:09
  • I understand how it works. But ownership gives the compiler more information, and with this information the same memory could be reused without any safety problem. I think that's a big optimization in some cases. But Rust hasn't done this (although when the function is inlined, it does use the same memory without a copy). – YangKeao Nov 25 '18 at 09:24
  • @SvenMarnach I think moving ownership ensures that the copy can be avoided. Is there any counterexample? – YangKeao Nov 25 '18 at 09:39
  • @YangKeao Even when inlining the function, [the copy does not seem to get elided](https://play.rust-lang.org/?version=stable&mode=release&edition=2015&gist=a8f90c085c74cbf7a2225af50ed8e6fe), but maybe that's because of taking the addresses, and wouldn't happen if we remove the `println` invocations? My understanding so far has been that Rust should be able to optimise that case. – Sven Marnach Nov 25 '18 at 12:22
  • @SvenMarnach After removing the `println!` in `foo`, I checked the `llvm-ir` and found no copy. But it's not only taking the address: some other operations on `f`, such as `println!("{}",f.a[0])`, also make it copy. The weird thing is that after adding `f.a[0]=10000` (and removing the print) in `foo`, it also does not copy. So I don't know when return by copy will happen :( – YangKeao Nov 25 '18 at 12:49
  • @YangKeao I guess these kinds of optimisations aren't meant as guarantees. If you want to guarantee that no copy is happening, pass around references or boxes instead. – Sven Marnach Nov 25 '18 at 12:59
  • Whether a move copies or not depends on whether it's a stack or heap allocation, no? – Olle Härstedt Mar 27 '21 at 16:16
  • @OlleHärstedt No, it copies by default, and then the compiler tries to optimize the copy away. – Sebastian Redl Mar 27 '21 at 20:42
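
The suggestion in the comments to pass `&mut f` and return nothing can be sketched as follows. This is an illustrative rewrite of the question's code, not the original: the constant `N` is an assumption introduced here and is deliberately smaller than the question's 100000 to keep the sketch light on stack, and the loop writes every element rather than using `f.a[0]` as a bound. The behaviour is the same for any size: the value never leaves the caller's stack frame, so its address cannot change.

```rust
// Assumed size for this sketch; the question uses 100000.
const N: usize = 10_000;

struct Foo {
    a: [u64; N],
}

// Borrows the caller's value mutably; nothing is moved or returned.
fn foo(f: &mut Foo) {
    for x in f.a.iter_mut() {
        *x = 21444;
    }
}

fn main() {
    let mut f = Foo { a: [0; N] };
    let before = &f as *const Foo;
    foo(&mut f);
    let after = &f as *const Foo;
    // `f` stayed in main's stack frame the whole time.
    assert_eq!(before, after);
    println!("{:?} {:?}", before, after);
}
```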