0

I have 3 files In a.h i have a #define ENABLE_STR that wraps a std::string str, I enable this macro only when defining the class A but when I am using A it is missed out.

This is a situation where a.cpp thinks that there is a str member but main.cpp doesn't know about it. And when the program runs int i is overwritten by the string str. Neither AddressSanitizer nor Valgind seem to detect this as invalid memory access.

// a.h
#pragma once 
#include <string>
class A
{
   public:
      A();
      std::string& get();
#ifdef ENABLE_STR
      std::string str;
#endif
      int i;
};

// a.cpp
#define ENABLE_STR
#include <iostream>
#include "a.h"

A::A():i(0){ }

std::string& A::get()
{
   std::cin >> str;
   return str;
}

//main.cpp
#include <iostream>
#include "a.h"

int main()
{
   A a;
   std::cout << a.get()  << "\n\n i:" << a.i << std::endl;
}
  • I would ideally assume that compiler/linker would flag this as an error but it doesnt.
  • Why could address sanitizer/valgrind not detect this, as this seems like writing over memory that is not owned by it.
  • Except not using macros like this in headers, how would one detect this?
tejas
  • 1,795
  • 1
  • 16
  • 34
  • "Except not using macros like this in headers, how would one detect this?" - It's not that that's the problem - it's the using of #define in the cpp file that's killing you – UKMonkey Jun 07 '18 at 13:30
  • 3
    Compiling a C++ program is hard enough without having to check programmers aren't abusing the preprocessor. [*"There can be more than one definition of a class type... Given such an entity named D defined in more than one translation unit, then each definition of D shall consist of the same sequence of tokens; and... If the definitions of D do not satisfy these requirements, then the behavior is undefined."*](http://eel.is/c++draft/basic.def.odr#12). That paragraph is about using inclusion to share class definitions. And the last bit places the burden of responsibility on you, to not abuse it. – StoryTeller - Unslander Monica Jun 07 '18 at 13:31
  • "And when the program runs `int I` is overwritten by the `string str`" is a bit unclear. What is your actual and expected output? – Sean Monroe Jun 07 '18 at 13:43
  • I feel like this is a duplicate: https://stackoverflow.com/questions/4192170/what-exactly-is-one-definition-rule-in-c – UKMonkey Jun 07 '18 at 13:44
  • @StoryTeller Its good that the standard mentions that this is UB, but i would expect the Compiler to at-least put up a warning here. – tejas Jun 07 '18 at 13:46
  • @UKMonkey This is simplification of of the problem. In reality they were completely different libraries Also the question doesn't answer how would one narrow down this Issue. – tejas Jun 07 '18 at 13:46
  • @acraig5075 corrected the header name. also I have to #define the macro somewhere, this example is just a simplification. – tejas Jun 07 '18 at 13:47
  • @tejas you missed the part in bold in the answer "that is used in that program". It's more general than just library – UKMonkey Jun 07 '18 at 13:49
  • 1
    @tejas - Why would expect that? The standard defines it like that to make compiler writer's jobs easier, among other things. The obvious solution is to not use hacks like that. Other than that, asking why an unmentioned compiler didn't want to implode into itself while diagnosing this, is at best unclear. – StoryTeller - Unslander Monica Jun 07 '18 at 13:52
  • The C preprocessor is the devil. This kind of use of the C preprocessor is badder-than-bad. Don't do it. – Eljay Jun 07 '18 at 14:08
  • @tejas `I would expect the Compiler to at-least put up a warning here.` How can it? It doesn't know that you defined your class differently in two different compilation units. What is the purpose of defining `class A` in two different ways anyway? Is it used in two different _programs_? (That would be OK). – Paul Sanders Jun 07 '18 at 16:39
  • @PaulSanders Class A is for example provided by an external library which when I use might not realize was compiled with a #define. – tejas Jun 08 '18 at 05:31
  • @tejas That would be one strange library. I really don't think people code that way. – Paul Sanders Jun 08 '18 at 07:13

1 Answers1

2

I would ideally assume that compiler/linker would flag this as an error but it doesnt.

You are providing the compiler with different class definitions for the same class for different translation units. This is undefined behavior, so no compiler has to diagnose it. As mentioned in the comments, compiler developers have other worries than the compiler user messing up their definitions in such a way.

On a more technical level, each translation unit emits an object file. The object files are linked together by the linker. But the object files know nothing of classes, only functions. It thus has no explicit knowledge about object sizes or member offsets.

Would it be possible to emit "compiler comments" in the object files to detect this? Or could this be part of debug symbols? Yes, but it would probably introduce significant bloat and increase compile times. Also, there is no requirement that your library binary has any of that, so it doesn't help in that case. So it would be an unreliable assistance in a rare case of the user messing up, with significant downsides.

Why could address sanitizer/valgrind not detect this, as this seems like writing over memory that is not owned by it.

I don't know enough about the inner workings of valgrind to give a good answer here, but presumably the str access that get assumes to be where i actually is inside a in main does not seem immediately suspicious to valgrind because the start of str is still within the memory allocated for A. If you only get a single character, small string optimizations might also cause main to never access A outside those first few bytes reserved for the int.

Except not using macros like this in headers, how would one detect this?

It's a horrible idea to use macros in such a way - precisely because these problems are almost impossible to detect. You mentioned some tools yourself that may quickly catch "malign" undefined behavior (e.g. a std::vector trying to manage memory after having its control structure overwritten), and you can both configure your operating system and compiler to instrument your program more stringently to get notified (e.g. -fstack-protector-all on gcc, /Gs and /RTCs on MSVC, these safety features on newer Windows and so on) when it does something dubious.

Nevertheless, this is still undefined behavior we are talking about, so there is no guaranteed way to find the problem, mainly because "everything works as expected when you try to identify problems" is still within the realm of "everything could happen". Or, in other words, there may simply be no symptoms present that are caught by these tools, even when your program still does subtly wrong things.

Max Langhof
  • 23,383
  • 5
  • 39
  • 72
  • My point about compiler was that it has all the info needed and people usually have debug and release builds, it could be included in probably just debug. sometimes the slowness is acceptable when we are debugging a crash. I fiddled around with gcc flags and found that though it cant detect this exact issue but the symptoms can be detected using stack protector flag `-fstack-protector-all`. Could you update your answer with this compiler flag and I will accept it. – tejas Jun 08 '18 at 06:25
  • I elaborated a bit on diagnosing this by symptoms. – Max Langhof Jun 08 '18 at 08:01