My understanding is that the following code has undefined behaviour in C++ due to something called the "strict aliasing rule".

#include <cstdint>

enum Foo : int16_t {};

void test(Foo& foo) {
    reinterpret_cast<int16_t&>(foo) = 42;
}

In particular, a C++ compiler may omit the assignment altogether: because the two types are different, it is allowed to assume that the int16_t& produced by the reinterpret_cast doesn't refer to the same memory as foo (of type Foo&).

My question is, does Rust have something akin to the C++ "strict aliasing rule"? In other words, does the following, equivalent Rust code have undefined behaviour?

#[repr(i16)]
enum Foo { Dummy }

unsafe fn test(foo: &mut Foo) {
    *std::mem::transmute::<&mut Foo, &mut i16>(foo) = 42;
}

EDIT:
There's an unrelated issue with the above Rust example code: test writes a value (42) that doesn't correspond to any variant of the enum. So, here's the same code with a quick fix:

#[repr(i16)]
enum Foo { Dummy, Yummy = 42 }

unsafe fn test(foo: &mut Foo) {
    *std::mem::transmute::<&mut Foo, &mut i16>(foo) = 42;
}
kmky
  • As far as I understand it: C/C++ essentially say "writing memory as one type and reading it back as a different type is never okay, except if you are using a `union`, or reading as `char`". Rust does not have such a blanket rule, and the legality depends on the particular pair of types in question. – glaebhoerl Oct 04 '14 at 12:37

1 Answer

My understanding is that general Rust code (as verified by the compiler) of course contains no undefined behavior (barring compiler bugs), and that Rust-the-language does not define any undefined behavior, unlike C.

Naturally, then, the only place undefined behavior could occur is within unsafe blocks or functions. unsafe blocks are meant to encapsulate potentially dangerous operations so that the block as a whole can be treated as safe. A post from July 2014 mentions that Rust's compiler requires that certain invariants are met, and that it is generally dangerous to break those invariants, even in unsafe blocks (in fact, it is only possible to break them within unsafe blocks).

One of these dangerous behaviors is pointer aliasing (which seems to be defined in LLVM itself). Interestingly, however, the LLVM docs say (emphasis mine):

Consequently, type-based alias analysis, aka TBAA, aka -fstrict-aliasing, is not applicable to general unadorned LLVM IR

Thus, it seems that generally, as long as none of the other unsafe behaviors are triggered as a result of the unsafe block, strict aliasing isn't an issue in Rust.
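
To make that concrete, here is a minimal sketch of my own (not taken from the LLVM docs or the Rust reference) that reads the bytes of an f32 through a pointer to a different type. The helper name `bits_via_pointer_cast` is made up for illustration. As far as I can tell this is fine in Rust because the access is judged on size, alignment and value validity rather than on the pointer's type; the equivalent reinterpret_cast in C++ would run into strict aliasing.

```rust
/// Hypothetical helper: reads the raw bits of an f32 through a *const u32.
fn bits_via_pointer_cast(x: &f32) -> u32 {
    let p = x as *const f32 as *const u32;
    // SAFETY (assumptions of this sketch): f32 and u32 are both 4 bytes with
    // the same alignment on common targets, and every 32-bit pattern is a
    // valid u32, so the read cannot produce an invalid value.
    unsafe { *p }
}

fn main() {
    let x = 1.0_f32;
    let bits = bits_via_pointer_cast(&x);
    // The safe standard-library route gives the same bits.
    assert_eq!(bits, x.to_bits());
    println!("{:#010x}", bits);
}
```

In real code `f32::to_bits` is the better choice; the pointer cast is only there to show that the type mismatch alone is not what makes such code dangerous.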


That being said, your specific example is possibly dangerous, because it seems to match one of the unsafe behaviors from the reference:

Invalid values in primitive types, even in private fields/locals:

  • A discriminant in an enum not included in the type definition

I wrote a small extension to your example that shows the binary representations of the enum variants, before and after the unsafe operation: http://is.gd/x0K9kN

As you can see, assigning 42 to the enum value does not match any of the defined discriminator values. If, however, you had assigned either 0 or 1 to the enum value (or instead had defined explicit discriminators for the enum variants), then the operation should theoretically be fine.
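
For what it's worth, here is a rough sketch of how I'd guard such a write, assuming explicit discriminants as in the edited question. `set_from_raw` is a hypothetical helper, not a standard API: it only writes through the i16 pointer after checking that the value names a declared variant.

```rust
#[repr(i16)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Foo {
    Dummy = 0,
    Yummy = 42,
}

/// Hypothetical helper: overwrites `foo` with the variant whose discriminant
/// is `raw`. Returns false and leaves `foo` untouched if no variant matches.
fn set_from_raw(foo: &mut Foo, raw: i16) -> bool {
    if raw != Foo::Dummy as i16 && raw != Foo::Yummy as i16 {
        return false;
    }
    let p = foo as *mut Foo as *mut i16;
    // SAFETY: #[repr(i16)] gives this fieldless enum the size and alignment
    // of i16, and `raw` was just checked to be a declared discriminant.
    unsafe { *p = raw };
    true
}

fn main() {
    let mut foo = Foo::Dummy;
    println!("before: {:?} = {}", foo, foo as i16);

    // 42 is the discriminant of Yummy, so the write is accepted.
    assert!(set_from_raw(&mut foo, 42));
    println!("after:  {:?} = {}", foo, foo as i16);
    assert_eq!(foo, Foo::Yummy);

    // 7 names no variant, so the helper refuses and `foo` is unchanged.
    assert!(!set_from_raw(&mut foo, 7));
    assert_eq!(foo, Foo::Yummy);
}
```

The check is the important part: the pointer cast itself is not the problem, an out-of-range discriminant is.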

voithos
  • "If, however, you had assigned either 0 or 1 to the enum value (or instead had defined explicit discriminators for the enum variants), then the operation should theoretically be fine." Enum layout is actually undefined, so, without explicit discriminants, even assigning 0 or 1 is possibly undefined behaviour (that is, the compiler is allowed to choose to not use 0 and 1). – huon Oct 03 '14 at 07:58
  • @dbaupp: The current docs seem to be a bit sparse on this - I couldn't find anywhere that explicitly says that C-like enum layout is undefined (C-like enums are already a special case, anyway). I did find an older article, however, which [states that automatic discriminants start with 0](https://gist.github.com/brson/9dec4195a88066fa42e6#enumerations), but it doesn't seem to be part of the current docs anymore. – voithos Oct 03 '14 at 18:36
  • "My understanding is that general Rust code [...] of course contains no undefined behavior [...], and that Rust-the-language does not define any undefined behavior, unlike C." True for safe Rust, but unsafe Rust has all the same issues as C and needs a definition of UB the same way C does. That definition is [WIP](https://internals.rust-lang.org/t/proposal-reboot-the-unsafe-code-guidelines-team-as-a-working-group/7307). But yes, Rust does not have type-based strict aliasing the way C(++) does. It has aliasing based on `&mut` and `&`, but that should not be a problem here. – Ralf Jung May 09 '18 at 16:46
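
A small sketch of the reference-based aliasing mentioned in the last comment, as I understand it (the function name `write_then_read` is made up for illustration): the guarantee comes from `&mut` being exclusive, not from the pointee types, which here are identical.

```rust
/// Because `a` is an exclusive reference, `b` cannot refer to the same i32,
/// so the write through `a` cannot change what `b` reads.
fn write_then_read(a: &mut i32, b: &i32) -> i32 {
    *a = 1;
    *b
}

fn main() {
    let mut x = 0;
    let y = 7;
    println!("{}", write_then_read(&mut x, &y));

    // The aliasing call is rejected by the borrow checker in safe code:
    // write_then_read(&mut x, &x);
}
```

Safe code cannot build the aliasing call at all; unsafe code that forges it through raw pointers is the case where this rule, rather than any type-based one, would be violated.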