I'm a little confused about move semantics in Rust after writing some code and reading some articles. I thought that after a value is moved, it should be freed and its memory invalidated. So I wrote some code to test this.

The first example:

#[derive(Debug)]
struct Hello {
    field: u64,
    field_ptr: *const u64,
}

impl Hello {
    fn new() -> Self {
        let h = Hello {
            field: 100,
            field_ptr: std::ptr::null(),
        };
        h
    }

    fn init(&mut self) {
        self.field_ptr = &self.field as *const u64;
    }
}

fn main() {
    let mut h = Hello::new();
    h.init();
    println!("=================");
    println!("addr of h: ({:?}) \naddr of field ({:?})\nfield_ptr: ({:?}) \nptr value {:?}", &h as *const Hello, &h.field as *const u64, h.field_ptr, unsafe {*h.field_ptr});

    let c = &h.field as *const u64;
    let e = &h as *const Hello;
    let a = h;
    let d = &a.field as *const u64;

    println!("=================");
    println!("addr of a: ({:?}) \naddr of field ({:?})\nfield_ptr: ({:?}) \nptr value {:?}", &a as *const Hello, &a.field as *const u64, a.field_ptr, unsafe {*a.field_ptr});
    println!("=================");
    println!("addr of c {:?}\nvalue {:?}", c, unsafe {*c});
    println!("addr of d {:?}\nvalue {:?}", d, unsafe {*d});
    println!("addr of e {:?}\nvalue {:?}", e, unsafe {&*e});
}

The output of the code above is:

=================
addr of h: (0x7ffee9700628) 
addr of field (0x7ffee9700628)
field_ptr: (0x7ffee9700628) 
ptr value 100
=================
addr of a: (0x7ffee9700720) 
addr of field (0x7ffee9700720)
field_ptr: (0x7ffee9700628) 
ptr value 100
=================
addr of c 0x7ffee9700628
value 100
addr of d 0x7ffee9700720
value 100
addr of e 0x7ffee9700628
value Hello { field: 100, field_ptr: 0x7ffee9700628 }

So, I create a self-referential struct Hello and make field_ptr point to the u64 field, and I use raw pointers to save the address of the struct and the address of the field. Then I move h to a to invalidate the h variable, but I can still read the value of the original variable through the raw pointers, even though IMO it should no longer exist?

The second example:

struct Boxed {
    field: u64,
}

fn main() {
    let mut f = std::ptr::null();
    {
        let boxed = Box::new(Boxed{field: 123});
        f = &boxed.field as *const u64;
    }
    println!("addr of f {:?}\nvalue {:?}", f, unsafe {&*f});
}

The result:

addr of f 0x7fc1f8c05d30
value 123

I create a boxed value and drop it after saving its address in a raw pointer, and I can still read the value of its field through the raw pointer.

So my questions are:

  1. Is a move in Rust actually a memcpy, with the original variable just "hidden" by the compiler?
  2. When does Rust actually free the memory of a heap-allocated value? (the second example)

Thanks

What I have read: How does Rust provide move semantics?

Sean
  • See this [hotel analogy answer](https://stackoverflow.com/questions/6441218/can-a-local-variables-memory-be-accessed-outside-its-scope/6445794#6445794) for why accessing data after it's been deallocated can still look valid. Running your second example under Miri catches this use-after-free bug, though I'm a bit disappointed it doesn't catch the first case. – kmdreko Sep 26 '21 at 15:58
  • #1 Yes, a move is a memcpy accompanied by ownership transfer, i.e. the old owner cannot access the value, and the new owner is responsible for freeing it. Of course, depending on optimizations, the copying might be eliminated. #2 The memory is freed when the value goes out of scope and `Drop::drop()` is invoked. This doesn't mean that the memory is cleaned up, it just means that it's marked free for future allocations (or, more rarely, returned to the OS for use by other programs). – user4815162342 Sep 26 '21 at 19:21
  • @user4815162342 Thanks for the answer, that answers my question. – Sean Sep 27 '21 at 05:13

1 Answer

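So the first block of your output should be clear, right? The address of the struct is the address of the first byte of the memory it occupies, which here is also the address of field, because field happens to be laid out at offset 0. (Rust's default struct layout doesn't promise any particular field order, so don't rely on that in general.)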
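If you want that address equality to be guaranteed rather than a coincidence, one option is `#[repr(C)]`, which keeps fields in declaration order so the first field sits at offset 0. A minimal sketch, assuming a trimmed-down copy of your Hello struct (the helper name `struct_and_first_field_addresses` is made up for illustration):

```rust
// #[repr(C)] lays fields out in declaration order, so `field` is at offset 0.
#[repr(C)]
#[allow(dead_code)]
struct Hello {
    field: u64,
    field_ptr: *const u64,
}

/// Returns (address of the struct, address of its first field) as integers.
/// Hypothetical helper, only here to make the comparison easy to check.
fn struct_and_first_field_addresses() -> (usize, usize) {
    let h = Hello {
        field: 100,
        field_ptr: std::ptr::null(),
    };
    let struct_addr = &h as *const Hello as usize;
    let field_addr = &h.field as *const u64 as usize;
    (struct_addr, field_addr)
}

fn main() {
    let (struct_addr, field_addr) = struct_and_first_field_addresses();
    println!("struct: {:#x}, first field: {:#x}", struct_addr, field_addr);
}
```

No unsafe is needed here, since the pointers are only compared and never dereferenced.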

Now for your second block. You're grabbing some raw pointers into the struct, and then you're moving the struct, via let a = h.

What that does is: on the stack we now have a new variable a, which is a byte-for-byte copy of what was stored in h. That's why both a and a.field have a new address. The raw pointer, of course, still points to the old h.field address, and since nothing has overwritten that stack slot yet, you can still read the old data there.
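You can observe the "copy the bytes, leave the pointer alone" behavior without dereferencing anything stale. The sketch below is my own variation on your example, not your exact code: it moves the struct into a Box so that the new location is certain to differ from the old stack slot (with a plain `let a = h` the optimizer is allowed to elide the copy), and it only compares addresses. The function name `addresses_after_move` is hypothetical.

```rust
struct Hello {
    field: u64,
    field_ptr: *const u64,
}

/// Returns (pointer stored before the move, address of `field` after the move).
fn addresses_after_move() -> (*const u64, *const u64) {
    let mut h = Hello {
        field: 100,
        field_ptr: std::ptr::null(),
    };
    // Self-reference: points into h's current (stack) location.
    h.field_ptr = &h.field as *const u64;

    // The move copies the bytes of h, including the old pointer value,
    // into a fresh heap allocation. The pointer is not adjusted.
    let b = Box::new(h);

    (b.field_ptr, &b.field as *const u64)
}

fn main() {
    let (stored, actual) = addresses_after_move();
    println!("pointer stored before move: {:?}", stored);
    println!("address of field after move: {:?}", actual);
    println!("same address? {}", stored == actual);
}
```

The stored pointer was copied verbatim, so it no longer refers to the field of the value it lives in. That is exactly why self-referential structs need something like Pin or an index instead of a raw self-pointer.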

Note though that you can only do that inside an unsafe block, because what you're doing is unsafe. There's no guarantee that whatever your field pointer points to remains valid.

If you remove all uses of unsafe, there is no way to reach the old h.field once h has been moved into a: the compiler rejects any later use of h.
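For comparison, here is roughly what the safe version looks like. This is a sketch with a reduced Hello struct and a made-up function name (`read_after_move`); the line that would touch h after the move is left commented out because rustc refuses to compile it (error E0382).

```rust
struct Hello {
    field: u64,
}

/// Moves `h` into `a` and reads the field through the new owner.
fn read_after_move() -> u64 {
    let h = Hello { field: 100 };
    let a = h; // ownership moves from h to a

    // Uncommenting the next line fails to compile with error E0382,
    // because h was moved and is no longer usable:
    // println!("{}", h.field);

    a.field // only the new owner can be used
}

fn main() {
    println!("field via a: {}", read_after_move());
}
```

So in safe code the old location is simply unreachable; whether its bytes are still sitting there is not something your program can ask.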

The same idea applies to the second example. You couldn't get to the dropped data if you weren't using raw pointers and unsafe blocks, and that's because this code is very suspicious: reading through a pointer to freed memory is undefined behavior. In your simple example, it still works because Rust doesn't just go ahead and scramble the memory of values that have been dropped. Unless something else in your program repurposes that memory, it will probably stay how you left it, but nothing guarantees that.
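If the goal is to see when the boxed value is dropped, you can do that without reading freed memory by giving the type a Drop impl that records the call. A minimal sketch, assuming a hypothetical `Noisy` type and `drop_runs_at_scope_end` function that are not part of your code:

```rust
use std::cell::Cell;

/// Hypothetical type that flips a flag when it is dropped.
struct Noisy<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Returns (flag while the box is alive, flag after the inner scope ends).
fn drop_runs_at_scope_end() -> (bool, bool) {
    let dropped = Cell::new(false);
    let before;
    {
        let _boxed = Box::new(Noisy { dropped: &dropped });
        before = dropped.get();
    } // _boxed goes out of scope: drop() runs, then the heap block is released

    (before, dropped.get())
}

fn main() {
    let (before, after) = drop_runs_at_scope_end();
    println!("dropped inside scope: {}, after scope: {}", before, after);
}
```

The flag flips at the closing brace of the inner block, which is the point where the heap allocation is handed back to the allocator. Handing it back does not zero it, which is why your raw pointer still happened to see 123.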

cadolphs
  • "In your simple example, it still works because Rust doesn't just go ahead and scramble the memory of values that have been dropped." I think that's more of an LLVM thing: technically it should be able to take temporal changes into account when allocating stack frames (and so reuse existing stack slots if their content is not valid anymore), but practically either it doesn't bother *or* Rust doesn't mark variables as dead so LLVM has to assume they're all live until end of frame. – Masklinn Sep 27 '21 at 09:56