Undefined behaviour in C++ can be really hard to debug. Is there a version of C++ and standard library which does not contain any undefined behaviour but rather throws exceptions? I understand that this will be a performance killer, but I only intend to use this version when I am programming, debugging and compiling in debug mode and don't really care about performance. Ideally this version would be portable and you would be able to easily switch on/off the undefined behaviour checks.

For example, you could implement a safe pointer class like so (only check for null pointer, not actually if it points to a valid block of memory):

#include <cassert>

template <typename T>
class MySafePointer {
    T* value;
public:
    auto operator->() {
#ifdef DEBUG_MODE
        // Checked only when DEBUG_MODE is defined (null check only, not validity)
        assert(value && "Trying to dereference a null pointer");
#endif
        return value;
    }
    /* Other Stuff */
};

Here the user only needs to #undef DEBUG_MODE to get the performance back.

Is there a library / safe version of C++ which does this?

EDIT: Changed the code above so that it actually makes more sense and doesn't throw an exception but asserts value is non-null. The question is simply a matter of having a descriptive error message vs a crash...
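
For comparison, the exception-throwing variant that the edit note refers to might look like the sketch below. `CheckedPtr`, `Point` and `DEBUG_MODE` are hypothetical names used for illustration, not part of any existing library, and the check only covers null pointers, not dangling ones.

```cpp
#include <stdexcept>

// Assumption: defined here for the demo; normally this would come from the
// build configuration (e.g. -DDEBUG_MODE) for debug builds only.
#define DEBUG_MODE

// Hypothetical non-owning wrapper: checks for null only, not for dangling pointers.
template <typename T>
class CheckedPtr {
    T* value_;
public:
    explicit CheckedPtr(T* p = nullptr) : value_(p) {}

    T* operator->() const {
#ifdef DEBUG_MODE
        // Report the problem with a descriptive message instead of dereferencing null.
        if (!value_) {
            throw std::logic_error("Trying to dereference a null pointer");
        }
#endif
        return value_;
    }
};

// Small helper type for trying the wrapper out.
struct Point { int x; };
```

With the macro defined, `CheckedPtr<Point>{}->x` throws `std::logic_error` before any dereference happens; without it, `operator->` reduces to returning the raw pointer.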

SomeProgrammer
  • There are compiler options, like `-fsanitize=undefined`. But some undefined behaviors are just _impossible_ to detect. – KamilCuk Mar 26 '21 at 11:45
  • Program in constant expressions: UB is disallowed there (as are some legal runtime constructs). – Jarod42 Mar 26 '21 at 11:46
  • @KamilCuk why are some undefined behaviours impossible to detect? – SomeProgrammer Mar 26 '21 at 11:48
  • Your "safe" pointer fails for an invalid non-null pointer (a dangling pointer). – Jarod42 Mar 26 '21 at 11:48
  • @SomeProgrammer: [Halting problem](https://en.wikipedia.org/wiki/Halting_problem) for example. (infinite loop without observable behavior is UB). – Jarod42 Mar 26 '21 at 11:50
  • `gsl::not_null`. Related: https://stackoverflow.com/questions/33306553/gslnot-nullt-vs-stdreference-wrappert-vs-t. Asking for libraries is unfortunately off-topic. – 463035818_is_not_an_ai Mar 26 '21 at 11:57
  • @Jarod42: But obviously an implementation _can_ choose to implement an infinite loop as an infinite loop. – MSalters Mar 26 '21 at 11:58
  • @largest_prime_is_463035818 this was an example to show how pointers could be checked for null, not if they were pointing to a valid block of memory. I suppose the latter could be checked by using a boolean flag or something... – SomeProgrammer Mar 26 '21 at 11:59
  • Well, no, it isn't trivial to make a raw pointer keep track of whether the pointed-to object is alive. You need reference counting or similar, and then you are just reinventing smart pointers. – 463035818_is_not_an_ai Mar 26 '21 at 12:00
  • @largest_prime_is_463035818 I know it's non trivial, that's why I haven't done it myself yet. I'm asking if I can use c++ without undefined behaviour (and with a performance decrease). And please do not tell me this is impossible, plenty of other languages do not have undefined behaviour. – SomeProgrammer Mar 26 '21 at 12:02
  • Depending on how far you need to go on the safety front, the best option is to change languages. C# and Rust come to mind. – Jeffrey Mar 26 '21 at 12:03
  • the core guideline support library (gsl) tries to offer facilities that let you write "safer C++" – 463035818_is_not_an_ai Mar 26 '21 at 12:03
  • @largest_prime_is_463035818: I think C++20 should do the check for the "constexpr new"... – Jarod42 Mar 26 '21 at 12:03
  • I am not trying to tell you that it is impossible. Sorry, that's a misunderstanding. I am trying to tell you that it is possible. Look at `gsl::not_null` or the `std::` smart pointers. – 463035818_is_not_an_ai Mar 26 '21 at 12:04
  • I guess that if you managed to make a library that checks for most kinds of undefined behaviors, the impact in performance would probably be so great as to make the library unusable in real applications. – Ion Larrañaga Mar 26 '21 at 12:05
  • @Jeffrey I don't want to change languages. I just want to write c++ in debug mode which reports to me runtime errors / undefined behaviour when I program and debug my code. – SomeProgrammer Mar 26 '21 at 12:05
  • @ Ion Larrañaga I would not actually be using the library when shipping the product, just when programming/debugging – SomeProgrammer Mar 26 '21 at 12:05
  • @largest_prime_is_463035818 Yes sorry I was replying to other comments. The problem with std::shared_ptr is that it alters the behaviour of the program, what I want is a pointer which throws an exception instead of crashing / undefined behaviour – SomeProgrammer Mar 26 '21 at 12:08
  • I understand but even then... there are bugs in release builds that are not reproduced in debug builds because of the differences between them.... and the differences between them are not as great as what you propose. – Ion Larrañaga Mar 26 '21 at 12:09
  • @SomeProgrammer If you just want to debug possible undefined behaviour, use the [undefined behavior sanitizer option](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html) of your compiler. It might not catch every possible case of UB though. – G. Sliepen Mar 26 '21 at 12:10
  • @ G. Sliepen yes that seems like the best option right now. – SomeProgrammer Mar 26 '21 at 12:10
  • @MSalters: An infinite loop implemented as an infinite loop is still UB, so it would not be a "safe" version, and throwing an exception is not possible as it is not detectable in all cases. A compiler might define its behavior for each kind of UB, but I'm not sure that is even possible. On top of that, there are also programs that are ill-formed NDR :-) – Jarod42 Mar 26 '21 at 12:13
  • `gsl::not_null` does throw an exception. It's just not exactly what you want because, afaik, the exception is not thrown when dereferencing, but before. – 463035818_is_not_an_ai Mar 26 '21 at 12:14
  • *"plenty of other languages do not have undefined behaviour"* -- this is an invalid argument. C++ is not one of those other languages. If all languages were equivalent, we would not need so many. As one example: how many of those "other languages" avoid certain undefined behaviours by not providing direct access to pointers? (They, like Java, might use pointers, but not as a distinct type.) That's an unfair comparison, and those languages should be removed from the comparison. And so on. *(BTW, do you consider Visual Basic to be a safe language? Once I managed to trash the stack in VB.)* – JaMiT Mar 26 '21 at 13:04
  • Considering what you're asking for, you should take a look at the rust-lang. It was basically created to avoid many of the inherent problems of C++ – Adam Jan 24 '22 at 08:03
  • Be careful with `assert`: it only checks when `NDEBUG` is not defined, which typically means debug builds. (You stated that you want it in debug builds, so that should be okay.) – Sebastian Nov 01 '22 at 18:15
  • Rust has possible UB, wherever one uses unsafe mode: https://runrust.miraheze.org/wiki/Undefined_Behavior In safe mode UB is called soundness bug instead: https://docs.rs/dtolnay/0.0.7/dtolnay/macro._03__soundness_bugs.html#soundness C# also has UB in unsafe contexts anyway and also in safe contexts: https://stackoverflow.com/questions/1860615/code-with-undefined-behavior-in-c-sharp/26173701#26173701 – Sebastian Nov 01 '22 at 18:21
  • The standardization of C++ is slowly finding solutions for UB. Functions evaluated at compile time must not exhibit UB for their actual inputs. The contracts feature is coming (compilers could often detect and warn if some input still allowed by the preconditions would lead to UB). We have modules, so the translation units are more cleanly separated and the library headers and their templates are not part of the user program. This makes it more feasible to implement a safe mode, where some constructs, e.g. raw pointers or new/delete, are not allowed. There are standard proposals for different modes. – Sebastian Nov 01 '22 at 18:33
  • Try out your code with different compilers. Activate ubsan: https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html – Sebastian Nov 01 '22 at 18:40

3 Answers

Is there a version of c++ and standard library which does not contain any undefined behaviour but rather throws exceptions?

No, there is not. As mentioned in a comment, there are Address Sanitizer and Undefined Behavior Sanitizer and many other tools you can use to hunt for bugs, but there is no "C++ without undefined behavior" implementation.
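
As a rough illustration of the sanitizer route, the function below overflows a signed `int` when called with `INT_MAX`, which is undefined behaviour. The function name and file name are made up for this example; the flags in the comment are the ones GCC and Clang document for these sanitizers.

```cpp
// Build with, for example:
//   g++ -g -fsanitize=address,undefined example.cpp
// (clang++ accepts the same flags.)

// Hypothetical example function: x + 1 overflows when x == INT_MAX,
// which is undefined behaviour for signed integers.
int next_value(int x) {
    return x + 1;
}
```

In a build with `-fsanitize=undefined`, a call such as `next_value(INT_MAX)` is expected to produce a runtime diagnostic that names the offending source line, whereas an uninstrumented build gives no warning at all. As the comments point out, the sanitizers do not catch every kind of UB.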

If you want an inherently safe language, choose one. C++ isn't it.

John Zwinck
  • That's a shame. – SomeProgrammer Mar 26 '21 at 12:06
  • @SomeProgrammer, it's a shame of many, if not most, languages, I believe. – Enlico Mar 26 '21 at 12:10
  • @Enlico which languages are you talking about? – SomeProgrammer Mar 26 '21 at 12:12
  • @SomeProgrammer, languages which are not inherently safe. Do you know one? By the way, any language which fails to define something in its standard admits undefined behavior. So you're really asking if there's a language the standard of which defines every possible scenario. Do you know of one? – Enlico Mar 26 '21 at 12:14
  • @Enlico Well, rigorously speaking no, but in languages like Java undefined behaviour is not such a prominent aspect... And I understand that this will cause a performance slowdown, and that performant C++ must have undefined behaviour, but I am willing to accept this performance slowdown when compiling in debug mode to help with debugging. – SomeProgrammer Mar 26 '21 at 12:35
  • @SomeProgrammer - The C++ standard explicitly defines the meaning of "undefined", and (where it can) specifies where behaviour is undefined. The Java spec doesn't do this, but still has (mostly ignored) examples of undefined behaviour. Java designers sought to stamp out behaviours that are undefined in C++ (by either defining a specific outcome or preventing them happening) without documenting trade-offs of doing so (e.g. runtime cost on some machines but not others, intuitive to some programmers but unintuitive to others). There is a reason for Java's label as "write once, debug everywhere". – Peter Mar 26 '21 at 14:55
  • @Enlico: Many languages define the behavior of every possible action as being, at worst, an Unspecified choice from a bounded range of possible actions. Languages which do so are more suitable for many purposes than language dialects which interpret things like integer overflow as an invitation to throw rules of time and causality out the window. – supercat Mar 30 '21 at 21:50
  • @supercat, isn't allowing throwing rules out of the window the reason why some things can be better optimized, not having to clutter them with a lot of if/else? I'm certainly less expert than you, so feel free to throw my comment out of the window :D – Enlico Mar 31 '21 at 07:04
  • @Enlico: Most useful optimizations could be achieved by treating things as an unspecified choice from a number of options. For a compiler to use the fact that code will compute `(ushort1 * ushort2) & 0xFFFFu` to infer that `ushort1` will never exceed `INT_MAX/ushort2`, and process surrounding code in a manner which behaves nonsensically if `ushort1 > INT_MAX/ushort2`, may in some contrived scenarios allow that surrounding code to be much more efficient, but would in many more scenarios replace a program that meets requirements with a "more efficient" program that doesn't. – supercat Mar 31 '21 at 14:20
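
To make the `ushort1 * ushort2` example concrete: on platforms where `int` is 32 bits wide, both `unsigned short` operands are promoted to signed `int` before the multiplication, so a product above `INT_MAX` is signed overflow and therefore UB. A minimal sketch of one way to sidestep that, with made-up function names, is to do the multiplication in `unsigned`, whose arithmetic is defined to wrap:

```cpp
// Hypothetical helper: low 16 bits of the product, without signed overflow.
// Converting one operand to unsigned makes the whole multiplication unsigned,
// and unsigned arithmetic is defined to wrap around.
unsigned mul_low16(unsigned short a, unsigned short b) {
    return (static_cast<unsigned>(a) * b) & 0xFFFFu;
}

// The risky form, shown for contrast: with a 32-bit int, a * b is computed
// in signed int and overflows for large inputs such as 65535 * 65535.
unsigned mul_low16_risky(unsigned short a, unsigned short b) {
    return (a * b) & 0xFFFFu;
}
```

Both functions agree for small inputs; only `mul_low16` has defined behaviour over the whole input range.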

Undefined behavior

Undefined behavior means that your program has ended up in a state the behavior of which is not defined by the standard.¹

So what you're really asking is if there's a language the standard of which defines every possible scenario.

And I can't think of a single language like this, for the simple reason that programs are run by machines, but programming languages and standards are written by humans.

Is it always unintentional?

Per the reason explained above, the standard can have unintentional "holes", i.e. undefined behavior that was not intentionally allowed, and maybe not even noticed during standardization.

However, as all the "is undefined behavior" sentences in the standard prove, many times UB is intentionally allowed.

But why? Because that means giving fewer guarantees to the programmer, with the benefit of being able to make more optimizations or, equivalently, of not wasting time verifying that the user is sticking to a defined contract.

So, even if the standard had no holes, there would still be a lot of cases where UB is stated to happen by the standard, because compilers can take advantage of it to make all sorts of optimizations.²
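
The point from footnote ² can be seen in a small sketch: during constant evaluation the compiler has to reject core-language UB instead of exploiting it. The names below are made up for illustration.

```cpp
#include <climits>

// Hypothetical example: x + 1 overflows when x == INT_MAX.
constexpr int increment(int x) {
    return x + 1;
}

// Fine: no overflow happens during these constant evaluations.
static_assert(increment(1) == 2, "evaluated at compile time");
constexpr int near_max = increment(INT_MAX - 1);

// Uncommenting the next line should make compilation fail, because the signed
// overflow means increment(INT_MAX) is not a constant expression:
// constexpr int bad = increment(INT_MAX);
```

The same call made at run time, outside a constant expression, would compile without complaint and simply be UB.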

The impact of preventing it in some trivial case

One trivial case of undefined behavior is when you access an out-of-bounds element of a std::vector via operator[]. Exactly like for C-style arrays, v[i] basically gives you back *(v_ + i), where v_ is the pointer wrapped in v. This is fast but not safe.³

What if you want to access the ith element safely? You would have to change the implementation of std::vector<>::operator[].

So what would the impact be of supporting the DEBUG_MODE flag? Essentially you would have to write two implementations separated by #ifdef/(#else/)#endif. Obviously the two implementations can have a lot in common, so you could #-branch several times in the code. But... yeah, my bottom line is that your request can be fulfilled by changing the standard in such a way that it forces implementers to support two different implementations (safe and slow / unsafe and fast) for everything.
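
The two-implementation idea can be sketched on a toy container. Everything here (`DEBUG_MODE`, `IntArray4`) is hypothetical and only illustrates the `#ifdef` branching; it is not how any particular standard library is written.

```cpp
#include <cstddef>
#include <stdexcept>

// Assumption: defined here for the demo; normally set by the build system
// for debug builds only.
#define DEBUG_MODE

// Hypothetical fixed-size container whose operator[] is checked in debug mode.
class IntArray4 {
    int data_[4] = {0, 0, 0, 0};
public:
    int& operator[](std::size_t i) {
#ifdef DEBUG_MODE
        // Safe and slow: validate the index on every access.
        if (i >= 4) {
            throw std::out_of_range("IntArray4: index out of bounds");
        }
#endif
        // Shared path: plain access. Without the check above, an
        // out-of-range i is undefined behaviour.
        return data_[i];
    }
};
```

Some standard library implementations offer a comparable switch of their own; libstdc++, for instance, documents a `_GLIBCXX_ASSERTIONS` macro that enables extra precondition checks.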

By the way, for this specific case, the standard does define another function, at, which is required to handle the out-of-bounds case. But that's the point: it's another function.
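
A short sketch of the difference, using only the standard library; `element_or` is a made-up helper name:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns v.at(i), or `fallback` if i is out of range.
// at() is required to throw std::out_of_range in that case, whereas v[i]
// with the same index would be undefined behaviour.
int element_or(const std::vector<int>& v, std::size_t i, int fallback) {
    try {
        return v.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```

For a three-element vector, `element_or(v, 3, -1)` takes the `catch` branch and returns the fallback, because index 3 is one past the last valid element.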


(¹) In which case the standard (my emphasis)

places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation)

(²) I really recommend reading this answer by Nicol Bolas about UB being absent in constexprs.

(³) This and other examples of UB are listed in this excellent article; search for Out of Bounds for the example I made.

Enlico
  • What I meant was: rather than having undefined behaviour / a crash / a core dump, is there some version of C++ where you could get a descriptive error message? – SomeProgrammer Mar 26 '21 at 12:21
  • A core dump is the most descriptive error message possible. It shows every detail of your program state at the time of the crash! What could be more helpful when you're trying to fix bugs? – Useless Mar 26 '21 at 12:25
  • @SomeProgrammer of what? Of the fact that you're in the land where everything is possible and nothing can be guaranteed? – Enlico Mar 26 '21 at 12:25
  • @Enlico A descriptive error message saying that I am trying to access a null pointer at line ... instead of just "segmentation fault" – SomeProgrammer Mar 26 '21 at 12:32
  • @SomeProgrammer the stack trace should give you that info. – Enlico Mar 26 '21 at 12:43

Is there a safe version of c++ without undefined behaviour?

No.

For example, you could implement a safe pointer class like so

How is throwing an exception safer than just crashing? You're still trying to find the bug so you can fix it statically, right?

What you wrote allows your buggy program to keep running (unless it just calls terminate, in which case you did some work for no result at all), but that doesn't make it correct, and it hides the error rather than helping you fix it.

Is there a library / safe version of C++ which does this?

Undefined behaviour is only one type of error, and it isn't always wrong. Deliberate use of non-portable platform features may also be undefined by the standard.

Anyway, let's say you catch every uninitialized value and null pointer and signed integer overflow - your program can still produce the wrong result.

If you write code that can't produce the wrong result, it won't have UB either.

Useless
  • Sorry my bad... I was just throwing an exception to get a descriptive error message, I was never thinking of catching it. – SomeProgrammer Mar 26 '21 at 12:16
  • Sure, but the point is that dereferencing a null pointer and crashing immediately provides _more_ information about the bug than throwing an exception or anything else. Moving this kind of error-checking into the type system (like gsl) is a better approach. – Useless Mar 26 '21 at 12:24
  • well no on my machine I just get "core dump" – SomeProgrammer Mar 26 '21 at 12:25
  • Did you try looking at the core file that was dumped? If you actually just get "Aborted" or similar, make sure core dumps are enabled and the size limit is reasonable. – Useless Mar 26 '21 at 12:26
  • Where is the core file? – SomeProgrammer Mar 26 '21 at 12:31
  • There are many situations where programs are required to behave usefully when possible, and in tolerably useless fashion when useful behavior isn't possible (e.g. because of invalid input). Indeed, most programs would be subject to such constraints, and in many of those cases forcing abnormal program termination when given invalid input would be tolerably useless, but giving control of one's computer to the fabricator of maliciously-formed input would not be. – supercat Mar 30 '21 at 21:45