5

Consider this code:

enum MyError {
    BadError, NotSoBad,
}

#[inline(never)]
fn do_stuff1<T>(m: Result<T, MyError>) -> bool {
    m.is_err()
}

#[inline(never)]
fn do_stuff2<T>(m: Result<T, MyError>) -> bool {
    if let Err(MyError::BadError) = m {
        true
    } else {
        false
    }
}

The resulting assembly (see playground):

playground::do_stuff1:
    cmpb    $1, %dil
    sete    %al
    retq

playground::do_stuff2:
    testb   %dil, %dil
    setne   %cl
    testb   %sil, %sil
    sete    %al
    andb    %cl, %al
    retq

Testing result.is_err() results in just one byte comparison, which seems reasonable but pattern-matching for the specific error results in a few more instructions.

This is because the error type is actually encoded as a separate byte from the error flag. A Result<u64, MyError> is laid out like this:

    .--- is error?
   /
  /  .--- error variant 
 /  /
01 01 XX XX XX XX XX XX 00 00 00 00 00 00 00 01
      \               / \                     /
       `-- padding --'   `---- data (u64) ---'

My expectation was that a single byte would be used to encode both the fact that there is an error and which variant the error is, which would allow for simpler assembly.

Is there a reason why this optimisation isn't possible, or are there some tradeoffs that would make it undesirable?

Peter Hall
  • 53,120
  • 14
  • 139
  • 204
  • 1
    Honestly, I honestly think it's just oversight/a lot of work to implement correctly. Rust recently got `NonNull` (and `NonZero`) to minimize the storage required to optimize `Optional` so it's as efficient as a raw pointer. Unlike Optional, this only works for certain data types (NonNull or unsaturated enumerations), so it likely hasn't been a priority yet. (This is a guess, but...). – Alex Huszagh Aug 29 '19 at 14:39
  • 2
    It's just not clever enough (yet). Currently the [code](https://github.com/rust-lang/rust/blob/master/src/librustc/ty/layout.rs#L921) basically falls back to normal tagged union layout whenever more than one `enum` variant is non-zero-sized, even if one of them has a niche that could replace the tag. `Result` has two dataful variants, so the compiler just gives up. – trent Aug 29 '19 at 15:16
  • If you were to use `Result, BadError>` you would find the compiler happy to optimize it -- it handles nesting just fine. I tried to demonstrate this in the playground but was stymied by the optimizer deleting and deduplicating most of my code, even with `inline(never)`. – trent Aug 29 '19 at 15:33

0 Answers0