
I think I understand, at a very high level, what the difference between & and * in Rust is as it pertains to memory management.

What is the difference between the following code snippets? Are there dangers to applying one approach versus the other?

for (i, item) in bytes.iter().enumerate() {
    if *item == b' ' {
        return i;
    }
}

for (i, &item) in bytes.iter().enumerate() {
    if item == b' ' {
        return i;
    }
}

for (i, item) in bytes.iter().enumerate() {
    if item == &b' ' {
        return i;
    }
}

As I understand it, iterating with iter() yields a reference to each element in bytes. To compare an item, I need to do one of three things: compare two references (&u8 with &u8); bind with the pattern &item so that item is of type u8; or dereference item in the comparison, so that item: &u8 becomes *item: u8.

  1. When I bind with (i, &item) and use item later on, is this exactly the same as dereferencing with *item in the first snippet, or is there a fundamental difference in how the compiler interprets the first and second snippets?

  2. Is there anything wrong with the third code snippet? I realize this is a bit of an opinion-based question. I realize that if I assigned item (or *item, or a reference to it) to another variable, I would end up with different data types later on. Aside from managing data types, is there anything else to keep in mind when considering whether item == &b' ' is the right tool for the job?

Shepmaster
Matt

1 Answer


There is no difference whatsoever between these snippets; they generate the exact same assembly:

pub fn a(bytes: &[u8]) -> usize {
    for (i, item) in bytes.iter().enumerate() {
        if *item == b' ' {
            return i;
        }
    }
    0
}

pub fn b(bytes: &[u8]) -> usize {
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    0
}

pub fn c(bytes: &[u8]) -> usize {
    for (i, item) in bytes.iter().enumerate() {
        if item == &b' ' {
            return i;
        }
    }
    0
}

playground::a:
    negq    %rsi
    movq    $-1, %rax

.LBB0_1:
    leaq    (%rsi,%rax), %rcx
    cmpq    $-1, %rcx
    je  .LBB0_2
    cmpb    $32, 1(%rdi,%rax)
    leaq    1(%rax), %rax
    jne .LBB0_1
    retq

.LBB0_2:
    xorl    %eax, %eax
    retq

; The code is identical so the functions are aliased
.set playground::b, playground::a
.set playground::c, playground::a

For what it's worth, I'd write the function as

pub fn a(bytes: &[u8]) -> Option<usize> {
    bytes.iter().position(|&b| b == b' ')
}
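
One practical reason to prefer the Option-returning form: the loop versions above return 0 both when the space is at index 0 and when there is no space at all. The sketch below illustrates this; the names first_space and first_space_or_zero are hypothetical, chosen only for this example.

```rust
/// Hypothetical name for the `position`-based version.
pub fn first_space(bytes: &[u8]) -> Option<usize> {
    // `|&b|` destructures the `&u8` yielded by `iter()`, so `b` is a `u8`.
    bytes.iter().position(|&b| b == b' ')
}

/// The loop version, kept for comparison (hypothetical name).
pub fn first_space_or_zero(bytes: &[u8]) -> usize {
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    // Falls through to 0 when no space is found, which is
    // indistinguishable from "space at index 0".
    0
}
```

With these definitions, first_space(b"none") is None while first_space_or_zero(b"none") is 0, the same value it returns for b" x".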

iter() [...] a reference to the element found in bytes

Yes, iter is typically a method that returns an iterator of references.
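
For a slice of bytes, this can be checked with explicit type annotations. The following is a small sketch, with count_spaces and count_spaces_copied as hypothetical names for illustration; the second function uses the standard Iterator::copied adapter to turn the references into values up front.

```rust
pub fn count_spaces(bytes: &[u8]) -> usize {
    let mut count = 0;
    for item in bytes.iter() {
        // This annotation compiles only because `iter()` yields `&u8`.
        let by_ref: &u8 = item;
        // `u8` is `Copy`, so dereferencing copies the byte out.
        let by_value: u8 = *by_ref;
        if by_value == b' ' {
            count += 1;
        }
    }
    count
}

pub fn count_spaces_copied(bytes: &[u8]) -> usize {
    // `copied()` converts the iterator of `&u8` into an iterator of `u8`.
    // `filter` passes a reference to each item, hence the `|&b|` pattern.
    bytes.iter().copied().filter(|&b| b == b' ').count()
}
```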

I need to either compare between

Generally, you need to compare two things with the same number of reference levels, or sometimes with one level of difference. How you achieve this is immaterial: you can reference one value or dereference the other, and you can dereference via * in an expression or via & in a pattern.
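
As a rough sketch of what "matching reference levels" looks like in practice, the function below performs the same comparison three ways and notes the mismatched form that the compiler rejects. The names is_space and owned_matches are hypothetical. The String and &str comparison is my own guess at the kind of case "one level of difference" refers to; it compiles because the standard library provides that specific PartialEq impl.

```rust
pub fn is_space(item: &u8) -> bool {
    let a = *item == b' '; // u8 == u8: dereference in the expression
    let b = item == &b' '; // &u8 == &u8: reference the literal instead
    let &inner = item;     // `&` in a pattern strips one reference level
    let c = inner == b' '; // u8 == u8
    // let d = item == b' '; // does not compile: `&u8` cannot be compared with `u8`
    a && b && c
}

pub fn owned_matches(owned: String, borrowed: &str) -> bool {
    // Mixed comparison: works only because std implements
    // `PartialEq<&str>` for `String`.
    owned == borrowed
}
```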

Shepmaster