The Rust reference states:

  • The following is a list of behavior which is forbidden in all Rust code, including within unsafe blocks and unsafe functions:
    • Dereferencing a null/dangling raw pointer

This question is solely about the null part. There's no inherent reason to require that one definite but unknown address in an address space be made inaccessible, which is what most implementations of the null pointer do. That's my thesis, so why is Rust following in these footsteps when it seems to be merely ancient C cruft?

I've heard several stories (example, another, another) in my career where there was a need to access such a pointer, so why allow the spec (and hence, implementations) to get in the way again?

There's assembly output and a lot of context in the C++ Reddit thread from which this question stems. It was also brought up in this Rust Reddit thread.

Despite the many "war stories" referred to above, what really upsets me is not in that realm but in a more abstract one: the language specification makes address space access (which is delivered by the hardware) non-uniform, a priori, for all hardware/OS/architectures it may ever be used on.

Shepmaster
oblitum
  • Note that (even in C), a null pointer is not necessarily a pointer whose value is 0. If the address 0 was a valid value for a pointer on some platform, it could use any other placeholder for its null pointer. – mcarton Jan 13 '17 at 22:36
  • @mcarton I tried hard to avoid this misconception when writing the question, and failed, because each time someone comes along to point it out. And the endless discussion starts over about the null pointer not being the value 0. Sorry, but the question is not about that; it tries to avoid this interpretation. – oblitum Jan 13 '17 at 22:39
  • You've edited your question to add “a priori to all hardware/OS/architectures it may ever be used for” but that's the point of the null pointer: it is not set a priori for *all* hardware/OS/architectures. It is set a priori for *a* hardware/OS/architecture, but the value can be any value that does not make sense to dereference on that hardware/OS/architecture. – mcarton Jan 13 '17 at 23:40
  • @mcarton I edited it in several places trying to avoid that line of thought further. The question is not about which value it is; the issue is why any value is required for that at all. – oblitum Jan 13 '17 at 23:43
  • @mcarton the address space is made irregular upfront by the language spec (which is the reason this question is being asked). It ensures that any address except one (null, whatever it is) can be accessed without limitations imposed by the language (and, as a consequence, the compiler). The platform detail that there will always be an address that can be used for this task is just an assumption, and as such, why have this assumption (it seems completely disposable)? – oblitum Jan 13 '17 at 23:57
  • Regardless of the concrete representation of `NULL` (zero, -1, etc.), the point of a null pointer is that, by definition, it *doesn't point at a valid value*. Therefore, dereferencing a null pointer **has to be undefined behavior** in Rust. Your particular issue appears to be that the Rust implementation has chosen to equate the value zero and the null pointer, preventing you from dereferencing the pointer of value zero, but that's not really a question and it's more of a bug report, if anything. – Shepmaster Jan 16 '17 at 14:59
  • @Shepmaster You should read the comments above and then the given references. What you're saying is negating what's just above in the comments. – oblitum Jan 16 '17 at 15:09
  • @pepper_chico don't worry, I read the above. You appear to assume that a null pointer **has the same representation** as a normal pointer (thus causing it to fall in the same address space), but I'd bet that the reference makes no such assumption. Only the *implementation* makes that true. The **reference** doesn't say that dereferencing the address `0` or `-1` is undefined behavior, nor does it say that `std::ptr::null()` has the value `0` or `-1`. The *implementation* makes the equivalence. Regardless of the concrete value of the null pointer, dereferencing *the null pointer* **must** be UB. – Shepmaster Jan 16 '17 at 15:16
  • @Shepmaster that's why I use the word "basically" in one of the links inside one of the references given, specifically: http://nosubstance.me/post/dereferencing-null-pointers/. – oblitum Jan 16 '17 at 15:23
  • @Shepmaster if you have the rationale for why there must be a pointer that doesn't point to any value and why this is more important than having uniform access to the address space, and the references to back it, I would just be happy with an answer; that's what the question is for. – oblitum Jan 16 '17 at 15:27
  • *and why this is more important than have uniform access to the address space* — again, you are assuming that a null pointer's representation **must** lie in the address space, and nothing in the reference currently forces that to be true. It's the implementation that chooses to do that, and the existing answer already answers why that is immensely beneficial: it cuts down memory requirements for many usual types. If you cannot understand the benefit of a pointer that doesn't yet point at valid data, I'm not sure what I can say to convince you that it is useful. – Shepmaster Jan 16 '17 at 15:31
  • @Shepmaster I'm not assuming it must, I'm assuming it will, because implementations are not left with a better implementation choice, afaik. And it seems you're not reading: if you have the rationale, technical and theoretical backing, and references, you can just provide an answer, because that's what the site is for. – oblitum Jan 16 '17 at 15:38
  • Double check your statement, as it contains a conditional: *if I had references*. I don't, so I cannot answer. I'm simply pointing out logical issues with your question as phrased and adding in my own common sense explanations. – Shepmaster Jan 16 '17 at 15:40
  • @Shepmaster OK. I hope what I have provided helped to make it clearer. Thanks anyway. – oblitum Jan 16 '17 at 15:43
  • @Shepmaster [this reasoning](http://stackoverflow.com/questions/41643335/why-is-dereferencing-a-null-raw-pointer-undefined-behaviour#comment70491999_41646116) and [this comment](https://np.reddit.com/r/rust/comments/5nr9jt/a_personal_tale_on_a_special_value_rcpp/dceiwg7/) by a known Rust dev may serve as counterarguments. I just want to share them; it's not that I want to continue discussing. – oblitum Jan 16 '17 at 15:47

1 Answer

The null pointer is special-cased elsewhere in the language already. For example, `Option<Box<T>>` (where `T: Sized`) will use only one word, not two, because a null pointer is used to represent `None`. Disallowing code that follows null pointers is consistent with this idea.
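
The size claim above can be checked directly. The following is a minimal sketch, not part of the original answer; the helper name `option_box_is_pointer_sized` is made up for illustration, and `u64` is an arbitrary choice of `T`:

```rust
use std::mem::size_of;

/// Hypothetical helper: returns true when `Option<Box<T>>` occupies no more
/// space than `Box<T>`, i.e. when `None` is stored in the null bit pattern
/// that a `Box<T>` can never hold.
fn option_box_is_pointer_sized<T>() -> bool {
    size_of::<Option<Box<T>>>() == size_of::<Box<T>>()
}

fn main() {
    // A Box<u64> is a single non-null pointer, one machine word wide.
    assert_eq!(size_of::<Box<u64>>(), size_of::<usize>());

    // Wrapping it in Option adds no separate discriminant word.
    assert!(option_box_is_pointer_sized::<u64>());

    // A plain integer has no spare bit pattern, so Option<usize> must grow.
    assert!(size_of::<Option<usize>>() > size_of::<usize>());

    println!("Option<Box<u64>>: {} bytes", size_of::<Option<Box<u64>>>());
}
```

The contrast with `Option<usize>` is the point: the saving exists only because one pointer value is reserved as "never a valid `Box`".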

In a broader sense, Rust has not given as much attention to supporting exotic architectures as C. This is not out of malice, but merely a matter of priorities during its design. The language was built for a modern web browser, after all – an application which runs in user mode on x86 or ARM. That's not the kind of use case where these issues would come up. Perhaps if someone brought it up pre-1.0 it could have gone differently.

Lambda Fairy
  • I don't want to argue about it; I agree it looks consistent, but it's still just that. There's no reasoning on how that could affect unsafe raw address space access, if one wished for it. I mean, I'm thinking of a context where the implementation would itself rely on such a null to construct and provide its features, using it as a tool, but still not disallowing, or even caring, what address is going to be accessed in unsafe mode. On the platforms, yes, I agree too, but you can't stop progress :) and now I'm seeing Rust popping up in kernel/driver space, unikernels, etc. – oblitum Jan 14 '17 at 02:29