7

I am trying to understand the problems with duplicated code for & and &mut in getter-type functions. I'm trying to understand whether a particular solution to this problem using a cast inside an unsafe block would be safe.

The following is an example of the problem. It is taken from the really nice tutorial Learning Rust With Entirely Too Many Linked Lists.

type Link<T> = Option<Box<Node<T>>>;

pub struct List<T> {
    head: Link<T>,
}

struct Node<T> {
    elem: T,
    next: Link<T>,
}

impl<T> List<T> {
    // Other methods left out...

    // The implementation of peek is simple, but still long enough
    // that you'd like to avoid duplicating it if that is possible.
    // Some other getter-type functions could be much more complex
    // so that you'd want to avoid duplication even more.
    pub fn peek(&self) -> Option<&T> {
        self.head.as_ref().map(|node| {
            &node.elem
        })
    }

    // Exact duplicate of `peek`, except for the types
    pub fn peek_mut(&mut self) -> Option<&mut T> {
        self.head.as_mut().map(|node| {
            &mut node.elem
        })
    }
}

The solution

It seems to me that you can use a cast in an unsafe block to solve this problem. The solution seems to have the following properties:

  • It can be done in a safe way.
  • It provides a safe interface for an unsafe implementation.
  • The implementation is simple.
  • It removes the code duplication.

The following is the solution:

// Implemention of peek_mut by casting return value of `peek()`
pub fn peek_mut(&mut self) -> Option<&mut T> {
    unsafe {
        std::mem::transmute(self.peek())
    }
}

These are the arguments for why I think it seems safe:

  1. The return value of peek() is from a known source with a known alias situation.
  2. Since the parameter type is &mut self there are no refs to its elements.
  3. Hence the return value of peek() in unaliased.
  4. The return value of peek() does not escape this function body.
  5. Casting an unaliased & to &mut doesn't seem to violate the pointer aliasing rules.
  6. The lifetimes of the target and source of the cast match each other.

Misc notes

  • The same problem is discussed the following question: How to avoid writing duplicate accessor functions for mutable and immutable references in Rust?

    This question is different because it asks about details of one specific solution to the problem.

  • There are other kinds of solutions to the problem, such as this one: Abstracting over mutability in Rust

    But all other solutions seem to introduce quite a bit of extra complexity to the code.

  • The Nomicon asserts strongly that it is always undefined behaviour to cast & to &mut, which implies that this solution is not safe. But it makes no attempt whatsoever to explain why.

  • The Rust Reference states that "Breaking the pointer aliasing rules" is UB. To me it seems like this solution doesn't do that, for reasons given above.


Questions

I have the following questions to people with deeper Rust knowledge than myself:

  • Is the provided implementation of peek_mut safe?
  • Are there other necessary arguments that are needed to establish that it is safe, that I have missed?
  • If it is indeed not safe, why is that? Can you give a detailed explanation?
  • Is there in that case a similar solution to the problem that is safe?
Lii
  • 11,553
  • 8
  • 64
  • 88
  • 3
    For whatever it’s worth, even if this turns out to be safe, I wouldn’t do it. It trades a small amount of repetition for code that you have to think about when reading and an unsafe that you have to think about whenever making related changes or auditing unsafes. – Ry- Jan 04 '20 at 21:19
  • @Ry-: That's a good point... But if the getters are only little bit more complex and if there are many of them then it becomes a real pain to maintain duplicate versions... Anyway, it would be interesting to know if it is safe or not! – Lii Jan 04 '20 at 21:25
  • 1
    You could, for example, create a private `peek_raw` function that returns a raw pointer, and use that to write `peek` and `peek_mut` -- I believe this is a way to avoid the unsoundness of transmuting from `&T` to `&mut T`, since raw pointers may alias anything. However, the code it generates may not be as good as simply writing the "same" (or nearly-the-same) code twice. – trent Jan 04 '20 at 22:49
  • @trentcl: Ah, that sounds like an okay alternative, thanks! Another disadvantage apart from performance of that solution is that the implementation of `peek_raw` would be easier to get work that `peek` based on `&`, since it will not have normal Rust safety checks. – Lii Jan 04 '20 at 22:54
  • True, the safety would be enforced by `peek` and `peek_mut`, but that is essentially the same as you suggest doing with `transmute`, in my opinion. – trent Jan 04 '20 at 23:14
  • I encourage you to read some [Rust koans](https://users.rust-lang.org/t/rust-koans/2408). I think that both "Obstacles" and "Puom" are relevant to this question. – trent Jan 04 '20 at 23:49

1 Answers1

7

I believe this code will invoke undefined behaviour. To quote the Nomicon:

  • Transmuting an & to &mut is UB
    • Transmuting an & to &mut is always UB
    • No you can't do it
    • No you're not special

More to the point, the Rust compiler will mark the return value of peek() as immutable in the LLVM intermediate representation, and LLVM is free to make optimisations based on this assertion. It might not currently happen in this specific case, but I still think it's undefined behaviour. If you want to avoid the duplication at all costs, you can use a macro.

Sven Marnach
  • 574,206
  • 118
  • 941
  • 841
  • 1
    Yeah, I saw that section in Nomicon. It annoys me that it doesn't explain more! Can you provide any source for the fact that `&` return values are marked as "immutable"? And that this is UD to cast away, in all cases? Is there anything about this in the Rust or LLVM reference docs? – Lii Jan 04 '20 at 23:12
  • 1
    @Lii: “Transmuting an & to &mut is *always* UB” is a pretty good source for “in all cases”, no? – Ry- Jan 05 '20 at 00:22
  • I'd like to know more to understand why it is UB if it is so. You hinted on an explanation by saying "compiler will mark the return value of peek as immutable", and I'd like to find more details about that. Do you know any more details on why this would be UD, or do you where I could find more such details? – Lii Jan 05 '20 at 00:39
  • The Rust Reference is not that definitive, saying "Breaking the pointer aliasing rules" causes UB, and linking to the LLVM ref. Maybe it is possible to infer from the LLVM aliasing rules that this is UB? But to do that it seems like you need a bit deeper knowledge that what I have. – Lii Jan 05 '20 at 00:42
  • Also, the style of writing in the Nomicon makes me suspicious: Such strong and categorical formulations without any facts or reasoning or explanation correlates negatively with actual correctness in my experience. – Lii Jan 05 '20 at 00:47
  • @Lii I'd consider the Nomicon authoritive enough on this topic, but if you need more authoritive sources, the language reference says [this](https://doc.rust-lang.org/reference/behavior-considered-undefined.html): "*Behavior considered undefined: [...] **Moreover, all data reached through a shared reference or data owned by an immutable binding is immutable**, unless that data is contained within an `UnsafeCell`.*" – Frxstrem Jan 05 '20 at 09:35
  • It's not that the Nomicon is not authoritative enough, but that it doesn't give any details and explanation... It would be nice to understand more. – Lii Jan 05 '20 at 09:52
  • The statements in The Rust Reference are interesting! My reasoning for why the construct in the solution might be safe goes like this: The Rust Reference start the line that you site like this: "*Mutating immutable data [is UD].*". That is, it's not the act of constructing a ref to immutable data that is UD, but the act of modifying the data. And, when the return value of `peek()` is gone then so it the last shared reference. At that point the data is mutable again, and it should be safe to return a mutable reference to it. – Lii Jan 05 '20 at 10:00
  • @Lii You are correct that the quote implies that it is not necessarily the construction of the mutable reference that is UB, but mutating the data. But it also says that "all data **reach through a shared reference**", implying that it's UB even if the shared reference is gone. The standard library page for [UnsafeCell](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html) also says that "*In general, transmuting an &T type into an &mut T is considered undefined behavior.*" and "*`UnsafeCell` is the only core language feature to work around the restriction that &T may not be mutated.*" – Frxstrem Jan 05 '20 at 11:07
  • 5
    @Lii That being said, it's undefined behavior "because Rust says so". It's not UB because the compiler necessarily does something magical, but it *allows* the compiler to make optimizations based on the assumption that all well-formed programs have no UB. The categorization of UB is very broad because it restricts the number of well-formed programs and makes it easier for the compiler to optimize it, but it doesn't necessarily mean that every case that is undefined behavior has some compiler optimizations behind it. It just means that the compiler is free to do that. – Frxstrem Jan 05 '20 at 11:12