Dillemma with possible undefined behaviour in project and whether to add features

Question

I have such dilemma. Maybe someone will consider this opinion based, but I am sure this is relevant for many projects and also maybe someone will give answer which is not opinion based.

Imagine I was given large C++ project, which works correctly for many years (some may say this doesn't mean much if there is UB but come on, it means something).

Now, imagine two situations:

there is either (undefined behaviour) UB in code but it still works as I mentioned
there is no UB

Let's take first scenario.

Now if there is UB (and many instances of it) and project has been working fine I guess it is best to not touch project.

But what if I want to add feature to the project, what to do?

In such case:

adding even new library to project increases risk of project to stop working well
Recompiling with new compiler increases risk to make project stop working well

So, if there is UB, in the code, and it has been working fine, what to do if one wants to add new features to it? Without increasing risks that project will stop working correctly?

Is fixing all UB realistic in large C++ project by just code inspection? If no, we go to the point, when adding new features may break working project because it has been working well so far (please see two bullets points above).

Or maybe one tells the management because of such situation it is safest to not add anything to the project.

I hope I made my question clear.

I am pretty sure this is common situation, how do people proceed in such cases?

@NickyC No I don't know this is not my project it was written many years ago — , Dec 09 '15 at 08:35
That way, it is safe to assume there is no unit test at all. Defer all schedules. Write unit test, refactor the whole thing. The UB will come up in the process. There is no other way since the project is already paralyzed. — , Dec 09 '15 at 08:41
@NickyC Does unit tests work with network communications etc this is mainly project which handles network communication database etc. — , Dec 09 '15 at 08:43
`no undefined behaviour` is `undefined behaviour` ... it just does something, maybe good, maybe bad, maybe giving you a win in the casino — Zaiborg, Dec 09 '15 at 08:44
@user200312 i know, thats why it is a comment. if you want the answer if you should touch rotten code to extend its functionality with another library ... yes you should. dig deep, refactor, cleanup and write unit tests — Zaiborg, Dec 09 '15 at 08:49
Only the engineers of the project have a chance to know whether something is testable. I'm not one of them, but you are. If the units are not testable, or even there is no unit at all, conduct refactoring. — , Dec 09 '15 at 08:58
@NickyC That is not an option like I said it is quite large project and has been working for years well. One will have to justify refactoring — , Dec 09 '15 at 09:31
Discuss with the team. At least, add *comments* inside the code near suspected UB. — Basile Starynkevitch, Dec 09 '15 at 09:41
There is a justification. And it is not only a justification. It is a driving force that force you to refactor the project. The justification/force is, doing anything (**anything!**) increases the risk that project will stop working correctly. — , Dec 09 '15 at 09:45
Discuss these concerns with the bosses, make them aware of the risks, and ask them which way they want to proceed — M.M, Dec 09 '15 at 10:13
@M.M That is only a good idea if the bosses are software developers. Otherwise don't do that. — nwp, Dec 09 '15 at 10:38
@M.M Yes bosses are software developers I hope I can explain this to them. They don't seem to get hang of it from first try - they are mainly developers from other languages e.g C# — , Dec 09 '15 at 10:39
@nwp disagree ; it's the bosses who will have to deal with the fallout if you break something — M.M, Dec 09 '15 at 10:40
@M.M Are the risks different whether I discover UB in code or not?? — , Dec 09 '15 at 10:50
@M.M Yes but how do I verify that? Is it realistic to find all UB by inspection?? — , Dec 09 '15 at 11:18
@user200312 no, you decide before adding the feature. It doesn't really matter whether there is UB or not; changing the code risks breaking things if the code is old and crappy. — M.M, Dec 09 '15 at 21:22

score 2 · Answer 1 · answered Dec 09 '15 at 10:23

2

I know about this situation. Old codebase, needs new features, and needs them now (add that it runs on a 8051, and it has only some 31 bytes free of 64K if that's not already enough). The problem is that refactoring, building complete unit tests, whatnot is not an option due to the time constraints of the project - so either it is done in whatever manner it is doable, or not done

Of course it all starts by assessing the existing code, acquire documentation and understand it, what does what, what are the interfaces within the program. If it doesn't exist, then analyze the code until you reasonably figure out its modules, their roles and how they interface with each other, and do some documentation on it.

By your question you seem to worry about whether the compiler would produce "correct" result, now by "correct" meaning that it reproduces the same code which worked, and probably tested (at least an integration test) long ago, and which might still be working fine where it is deployed.

Look around in the codebase to see whether you still have the object files of a successful compile verified in some manner. If necessary, do some research (in the company where you work, probably the original developer's computer may still be sitting around somewhere with the artifacts of a correct verified compilation). Get those.

Then create your build script making sure it doesn't touch any of the existing object files unless requested to do so. This way you don't compile them, only link, thus avoid potential miscompilations coming from potential UB in the sources. Add some warning system to this build script which detect whether a source is changed compared to the originals, so you won't end up with an incorrect link.

With these preparations, you can set up a working environment where you can pull your changes off incrementally.

If you need to touch a source file, obviously you will need to assess it entirely, use some static analysis tool if available (and obviously compile with all warnings on, so you can see whatever the compiler itself might catch).

Cancelling the project

You have to consider this, too. I assumed the strict time limits which doesn't permit a "correct" solution. Of course you will have to assess risks, and if those are too high, the correct operation of the software is too critical, as a developer understanding the risks, you have the responsibility of properly explaining them to whoever requires the thing being done. It may be better for safety concerns or the reputation of the company, whatever, to not do the project at all, but the why's must be understood by those people who are in power of making the decision.

answered Dec 09 '15 at 10:23

Jubatian

2,171
16
22

I am not sure I can get my hands on object files - old developers didn't work for this company. Yes - I think you understood situation: If I add new features old code which was working may not work because I used new compiler etc. Same with libraries. Is eliminating all UB realistic? What to do I am a bit lost – Dec 09 '15 at 10:35
Should I grab obj file to each source file? Any tutorial how to compile it. Current project is in Visual Studio. If I can't get old object files what to do? "doesn't touch any of the existing object files unless requested to do so" how to ensure this? I am not very well familiar with manual compilation – Dec 09 '15 at 10:37
Try to get the old build environment. The developer might not be working at the company, but it isn't likely that he brought his office computer with him :) It should still be sitting somewhere, maybe someone else got it, but the old stuff never deleted. You might even have some company LAN, and the build system could also be sitting somewhere in archives on a server. Try to look around, if there is an obstacle, explain why this is very important! – Jubatian Dec 09 '15 at 10:40
If I get old object file can you please explain or give me link to tutorial how to continue development from then on? As I said I am not very well familiar with manual compilation – Dec 09 '15 at 10:41
A problem is that I am only familiar with manual compilation (makefile based build systems), and I didn't touch Windows since a decade for any coding work. So I can't really help on that part (using Linux for modern stuff, DOS for legacy). – Jubatian Dec 09 '15 at 10:42
Yeah but you are suggesting I go with manual compilation too? So how to proceed in my case? – Dec 09 '15 at 10:42
It's up to you. If you can understand and control the compilation process in Visual Studio, then you can do it there. I like the command line based solutions since I can always be sure on what exactly happens there, so I can know what gets compiled, what not, and can figure out if something goes amiss. It could be important. But with a huge enough project, you might even find that you are simply tied to whatever build system the original developer used (it happened: MDS70 computer emuleted on DOS, now emulated in DOSBox for a 8051 project). – Jubatian Dec 09 '15 at 10:45
So anyway the very first step really is trying to figure out the original build process. You might introduce errors even if the code is otherwise completely standards compliant if you build it the "wrong way" simply by changing environment which stubly misses something the original had or did. – Jubatian Dec 09 '15 at 10:50
It is Visual Studio -project is compiled with VS. Like I said I didn't exactly understand how to proceed with your solution so if you can link a tutorial or explain in more details what to do either in manual or some other way I would appreciate – Dec 09 '15 at 10:50
I can't link such. These are just own experiences from the problems I myself faced, and how we could proceed working around them in some sane manner. Things where one little change could stubly wreck something apparently completely unrelated. All I can say for the safe approach, by whatever means possible, try to start off with acquiring and anyalyzing the original build system, whatever it was, whatever tools it used. – Jubatian Dec 09 '15 at 10:54
Not sure what you mean with build system like I said it was Visual Studio – Dec 09 '15 at 11:19
OK, understood, so the original build system was also (some version of) Visual Studio, and the thing compiles. Then you will have to work out that, how to control the compilation process in it. There should be some underlaying script-like system, maybe XML files by which you could achieve using existing object files instead of compiling them anew, just if there is no way to get such capability out of the IDE directly. You might also want to use that version of it which was used for the original development to minimize chances for miscompiles. But I really can't help any more here. – Jubatian Dec 09 '15 at 12:35

score 0 · Answer 2 · answered Dec 09 '15 at 08:39

0

In my opinion, removing an UB should have no negative effects. You change your code from having "random effects" to having "well specified effects". Therefore you should remove it on sight or shortly after or at least comment it and leave for next iteration/sprint/etc to fix it.

Of course, that's an opinion, although a string one, still a guess. There is a small change that it's UB for Standard, but that your compiler has a certain defined behavior for it (even if undocumented, etc). In that case, it may happen that this code relies on that behavior or its sideeffects. Removing that UB and replacing it with valid code may remove that sideeffects and introduce obvious or subtle bugs.

So if your code works, I'd say, when you notice that UB and if it looks nontrivial, analyze it well and start writing tests around it to ensure that its effects are preserved after removing that UB nothing actually changes.

answered Dec 09 '15 at 08:39

quetzalcoatl

32,194
8
68
107

Yeah but I mentioned it can be there are many UB - can you spot all of them? – Dec 09 '15 at 08:40
@user200312: Immediatelly, no, in some time, yes. But that's also matter of "price". If you think the code is in a really poor shape, it might be reasonable to rewrite even some large parts of it. It's hard to say anything more without concrete examples. Oh and about libraries - you know, even without UBs in your code introducing a library can break things, a library can have problems/UBs/etc of its own :) If you think it may be beneficial, make a fork/branch and try. Usually you will quickly know if you are on a good way and at least you will get an estimate how long whole process will take. – quetzalcoatl Dec 09 '15 at 08:46
I am not sure just I am saying I add a new feature - because of using new compiler - bah, I might make UB which was working well with old compiler reveal itself now. I add stable library it might again awaken some UB in code which was not visible till now. Fixing all UB realistic? I don't know. That is why it is dilemma. – Dec 09 '15 at 08:49
@user200312: Hm.. I know that's a little worrying thing to ask, but can you use the old compiler? – quetzalcoatl Dec 09 '15 at 10:38
I'd have to deduce which version of Visual Studio they used? And continue development from then on? How to check that?? – Dec 09 '15 at 10:42
@user200312: that should be well known, its part of the project/code to know what tools to built it with! ask them, check the docs, check vcproj files or msbuild macros, check the file dates and check wiki when each VS version was released, ... – quetzalcoatl Dec 09 '15 at 11:12
asking is not an option I guess I need other way to find out – Dec 09 '15 at 11:19

score 0 · Answer 3 · answered Dec 09 '15 at 08:47

If worried, try applying a pattern like the adapter pattern described by GOF for instance... That is to say, add an intermediary layer between your new library calls and the old subsystems, making sure you can always use that as a filter to keep calling the old stuff in the right way, while still using new code as your primary interface.

score -1 · Answer 4 · answered Dec 09 '15 at 10:06

When maintaining a large project where UB is detected, you have to consider each case of UB on individual basis. It might be harmless, it might be a severe bug.

For example, the standard will label something like left-shifting a negative integer UB, because the standard allows all manner of obscure signedness implementations. In practice though, you will have a two's completement CPU which on machine instruction level performs such a shift in a well-defined, deterministic, well-documented manner. There will be no crash and there will be no surprises. You could even write code which relied on the shift to give a certain outcome without too much guilty conscience.

The standard is similarly paranoid about a lot of things that are in practice always well-defined by the implementation: integer overflows, function pointer casts, (the lack of) trap representations etc etc. These are examples of UB that you don't need to worry about unless portability to exotic systems is a concern.
Though if you chose to rely on such "harmless" UB, you must document the behavior on the specific system, and explicitly write comments showing that you have considered the outcome, preferably with references to compiler documents.

But then of course, most cases of UB are always bugs and always need to be fixed before they cause havoc. For example a function returning a pointer to a local variable. Even if happens to work today, it is a ticking bomb. "It works today" might mean that there is a dormant, severe bug in the program, which you have yet to detect, because the bug only appears in a rare use-case.

score -2 · Answer 5 · answered Dec 09 '15 at 09:38

You should realize that not all UB's are created equal. Although it is said that UB can result in nasal demons their consequences are often more predictable and benign than that. Often UB will only alter behavior if you change compiler/platform or the level of optimization.

This means that there's normally no reason to track down all UB and replace them before it becomes natural to do so. It's natural to get rid of UB when you maintain the part of the source where the UB resides. And when doing so you should of course have unit tests to make sure that the code under change does not alter it's behavior (in undesired ways).

And in maintaining you may encounter UB's that generates nasal demons from a distant galaxy. These should hopefully be few and would have to be addressed no matter if they seem related to what you're doing at the moment.

Dillemma with possible undefined behaviour in project and whether to add features

5 Answers5