Explanation of C++ FAQ's unsafe macro?

Question

According to the C++ FAQ, macros are evil:

[9.5] Why should I use inline functions instead of plain old #define macros?

Because #define macros are evil in 4 different ways: evil#1, evil#2, evil#3, and evil#4. Sometimes you should use them anyway, but they're still evil. Unlike #define macros, inline functions avoid infamous macro errors since inline functions always evaluate every argument exactly once. In other words, invoking an inline function is semantically just like invoking a regular function, only faster:
// A macro that returns the absolute value of i
#define unsafe(i)  \
        ( (i) >= 0 ? (i) : -(i) )

// An inline function that returns the absolute value of i
inline
int safe(int i)
{
  return i >= 0 ? i : -i;
}

int f();

void userCode(int x)
{
  int ans;

  ans = unsafe(x++);   // Error! x is incremented twice
  ans = unsafe(f());   // Danger! f() is called twice

  ans = safe(x++);     // Correct! x is incremented once
  ans = safe(f());     // Correct! f() is called once
}
Also unlike macros, argument types are checked, and necessary conversions are performed correctly.

Macros are bad for your health; don't use them unless you have to.

Can someone explain why unsafe(x++) increments x twice? I am not able to figure it out.

First time in the condition, 2nd time in the branches of the ternary operator. — nhahtdh, Apr 26 '13 at 19:05
Macros are *not* bad. If one doesn't understand a language construct and misuses (abuses) then almost every language construct can come under "..... is evil" category. — P.P, Apr 26 '13 at 19:10
macros is as evil as goto. If you do not understand, dont use it, if you do understand, that it is very very useful. — Jonatan Goebel, Apr 26 '13 at 19:10
To expand on Jonatan Goebel's thought: macros can be dangerous and confusing, but they have their place. Don't use a macro to do something that you can achieve simply in a less dangerous and confusing way. — Amish Programmer, Apr 26 '13 at 19:21
@KingsIndian, macros are evil if only because their effects aren't bounded reasonably. Unintended consequences are far too easy to cause and can't always be predicted, even when you're careful. For example if you have a class with a `GetText` function, be careful when you use it in a Windows program - Microsoft does `#define GetText GetTextW` which might cause a mismatch if you link against a library that doesn't use Windows. — Mark Ransom, Apr 26 '13 at 19:22
Agreeing with @Jonatan, I get really frosted by this judgementalism that assumes programmers are stupid. I don't mind mentioning what the pitfalls are, but this business of calling things evil is just adolescent nonsense. — Mike Dunlavey, Apr 26 '13 at 19:29
Macros do indeed have their place in C, when used carefully, but... C macros _are_ about as awful as a macro system can possibly be, and still be called a macro system. Just because some people overstate the case against ('evil' is a bit strong), doesn't mean the case _for_ C macros should be advanced with anything resembling enthusiasm. — Norman Gray, Apr 26 '13 at 22:14
C macros are not evil. They should be used for boilerplate code generation which otherwise would be unreadable and unmaintainable. But of course they should not be used in place of inline functions and such. My favourite example of the idiomatic macros use is .inc and .def files in LLVM and Clang. — SK-logic, Apr 26 '13 at 22:35
Actually, in almost every case, templates should be used for boilerplate code generation--not macros. Macros are really only good for syntactically strange things that can't be represented as a normal function. — StilesCrisis, Apr 26 '13 at 23:05
I was once shown a nearly complete implementation of C++ V1.0 done completely with macros on a C compiler. That was definitely `EVIL`, in the sense of being dangerous to the uninitiated, and fragile. — Pieter Geerkens, Apr 26 '13 at 23:23
@JonatanGoebel and perhaps you can show us a case where gotos are actually useful? Pointless generalizations are pointless. Perhaps you should read Dijkstra's paper which describes *why* gotos are best avoided. Gotos are considered harmful because they serve no useful purpose, they let us do nothing that can't be done in other, safer and more convenient ways in today's programming languages. And inevitably, newbies go "oh, but it can't be all that bad, I'm sure they have their uses", and no, unless you're working in C, I know of no situation where a goto is the best tool for the job — jalf, Apr 27 '13 at 10:16
Which, incidentally, is very different from macros. Macros are tricky to use correctly, and really should be avoided wherever possible, but they do have their uses. There are things you can do with macros that would not otherwise be possible. Equating the two shows a frightening degree of ignorance. — jalf, Apr 27 '13 at 10:17
@StilesCrisis, no, templates cannot be used for boilerplate code generation in cases I've mentioned: e.g., list of enums, for which you have to generate case statements in a variety of contexts and a type definition. Something like this cannot be done by the too limited C++ templates, but is pretty safe and straightforward with macros. — SK-logic, Apr 27 '13 at 15:32
@jalf, gotos are *extremely* useful and any language without goto support is quite handicap. You cannot implement a proper state machine without goto, because state transition is, *surprise*, semantically equivalent to goto, not to a `case` in a `switch` or anything else equally pathetic. You cannot implement high level optimisations for your [e]DSL compiler if your target language does not support goto. Goto haters are so brainwashed! — SK-logic, Apr 27 '13 at 15:36
all the people who have implemented state machines without gotos are going to be *extremely* surprised to hear that. — jalf, Apr 27 '13 at 18:45
@jalf, I said "*proper*" state machine, not a stupid leaky abstraction with switch or array of function pointers. Take a look at Knuth's `adventure` for example. — SK-logic, Apr 28 '13 at 17:36
And I say, once again, all the people who have done just that are going to be very surprised. I'll also say that many people who have used gotos for this are going to be quite surprised that it is apparently no longer a leaky abstraction. That it just works and won't in any way come back to bite you. (And, quite impressively, that the ultimate proof is in a code sample from 1975) — jalf, Apr 28 '13 at 17:43
@jalf, Knuth's code is still a right way to go (and it had been published long after Dijkstra's funny rant). And yes, goto is not a leaky abstraction in this case - it's exactly the problem domain semantics as it is. Goto is ultimately a state transition. Anything else is just a pathetic simulation of it. And it's funny that you ignored a much more important use of `goto` I've mentioned - in the generated optimised code. E.g., a parser generated out of a declarative high level definition. — SK-logic, Apr 28 '13 at 18:23
P.S., for a more modern (but much less elegant) example of code, just do `grep -r goto` inside Linux kernel source tree. You'll be surprised. — SK-logic, Apr 28 '13 at 18:25
not leaky? You do know there are quite a few limitations to how and where you can use gotos, and what happens to variables whose lifetimes span across labels, yes (especially with non-POD types)? I did ignore the optimization part, yes, because I'm talking about code you write yourself. Gotos in generated code is a different matter (because readability and maintainability isn't a concern, and in this sense it's just another name for a jump instruction). — jalf, Apr 29 '13 at 07:06
Anyway, until one of us is willing to take the time to implement the same code with and without gotos so we can make an actual apples to apples comparison, this is a waste of time. — jalf, Apr 29 '13 at 07:21
@jalf, code you write yourself will write the generated code. In the same language. And if this language is such a pathetic crap that it does not even support goto, then your optimisation options are limited and code generation is much more complicated then necessary. As for "limitations", remember, each state processing is isolated (no need to worry about variables life span). Again, read Knuth's code. — SK-logic, Apr 29 '13 at 08:00
@jalf, nobody is ever going to even try to re-write Linux or LLVM or whatever else without goto. Because goto is nice and readable, and without goto the same code looks like crap. Which simply proves my point: gotos are necessary in any imperative language, and those who do not use them because of some stupid religion are missing many of the opportunities and are incapable of writing and maintaining clean and efficient code. Dixi. — SK-logic, Apr 29 '13 at 08:02
If `goto` was that evil, why would Microsoft have implemented it in their C# compiler? — Mr Lister, Jul 15 '14 at 19:00
Why the hell are we having the same argument we've had a thousand times already? macros, `goto`, whatever. Do any of you think you're saying anything new? The question was simply asking why `unsafe(x++)` is problematic. I really think most of you should just delete your comments. (I guess I'm not being very novel here either!) — Aaron McDaid, Jan 29 '15 at 17:38
`++` is evil because it can cause undefined behaviour. E.g. `int x = INT_MAX; x++; /* <-- undefined behaviour */`. "evil" is overused in this context. Just explain the pitfalls and explain the potential problems in a factual way. — Brandin, Sep 04 '15 at 06:38
Part of the [rationale for statement expressions, which is an extension](https://stackoverflow.com/a/18885626/1708801) was safe macros. — Shafik Yaghmour, Nov 27 '18 at 06:43

score 69 · Accepted Answer · edited May 23 '17 at 10:28

Running the code through the preprocessor shows the problem. With gcc -E (or cpp -P, where the -P option also suppresses the generated # line markers), the output is:

inline
int safe(int i)
{
  return i >= 0 ? i : -i;
}

int f();

void userCode(int x)
{
  int ans;

  //    increment 1      increment 2 (one of these)
  //        |             |     |
  //        V             V     V
  ans = ( (x++) >= 0 ? (x++) : -(x++) );
  ans = ( (f()) >= 0 ? (f()) : -(f()) );

  ans = safe(x++);
  ans = safe(f());
}

As artless noise notes, the function f() is also called twice by the unsafe macro. Perhaps it's pure (has no side-effects) so it's not wrong, per se. But still suboptimal.
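To see the duplicate call for yourself, here is a minimal sketch that counts invocations. The counter `f_calls`, the body of `f()` and the two helper functions are my own illustrative assumptions; the FAQ only declares `int f();`.

```cpp
#include <cassert>

// Same macro and inline function as in the question.
#define unsafe(i) ( (i) >= 0 ? (i) : -(i) )

inline int safe(int i) { return i >= 0 ? i : -i; }

// Hypothetical stand-in for f(): counts how often it runs.
static int f_calls = 0;

int f() {
    ++f_calls;
    return -7;
}

// Number of times f() runs while evaluating unsafe(f()).
int calls_via_macro() {
    f_calls = 0;
    int ans = unsafe(f());  // expands to ( (f()) >= 0 ? (f()) : -(f()) )
    assert(ans == 7);
    return f_calls;
}

// Number of times f() runs while evaluating safe(f()).
int calls_via_inline() {
    f_calls = 0;
    int ans = safe(f());    // argument evaluated once, before the call
    assert(ans == 7);
    return f_calls;
}
```

Under these assumptions `calls_via_macro()` reports two calls and `calls_via_inline()` reports one. If `f()` were not pure (say it read from a stream), the macro version would also compute the wrong answer, not just a slow one.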

Inline functions are generally safer than function-like macros because they work on the same semantic level as the other basic elements of the language (variables and expressions), and enums are often a tidier choice for manifest constants. So what are the good uses of macros?

Setting constants known only at compile-time. You can define a macro from the command-line when compiling. Instead of

#define X 12

in the source file, you can add

-DX=12

to the cc command. You can also #undef X from the command-line with -UX.

This allows things like conditional compilation, e.g.

#if X
   do this;
#else
   do that;
#endif
   while (loop);

to be controlled by a makefile, itself perhaps generated with a configure script.
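As a compilable sketch of the same idea: the function name `selected_branch` and the fallback default of 0 are assumptions of mine, added so the file still builds when no -D option is given.

```cpp
#include <cassert>
#include <string>

// Fall back to a default when the build does not pass -DX=... (assumption:
// 0 is a sensible default for this illustration).
#ifndef X
#define X 0
#endif

// Reports which branch the preprocessor kept for this build.
std::string selected_branch() {
#if X
    return "this";
#else
    return "that";
#endif
}

// Sanity check that the kept branch agrees with the value of X.
void check_selection() {
    assert(selected_branch() == (X ? "this" : "that"));
}
```

Compiling with -DX=12 should keep the "this" branch; compiling with no option, or with -DX=0, should keep "that". Only one of the two return statements ever reaches the compiler proper.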

X-Macros. The most compelling use for X-Macros, IMO, is associating enum identifiers with printable strings. While it may look funny at first, it reduces duplication and synchronization issues with these kinds of parallel definitions.

#define NAMES(_) _(Alice) _(Bob) _(Caravaggio) _(DuncanIdaho)
#define BARE(_) _ ,
#define STRG(_) #_ ,
enum { NAMES(BARE) };
const char *names[] = { NAMES(STRG) };

Notice that you can pass a macro's name as an argument to another macro and then call the passed macro by using the argument as if it were itself a macro (because it is one). For more on X-Macros, see this question.
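Here is a slightly fuller sketch of the enum-to-string pattern. The enum name `Name`, the trailing `NAME_COUNT` enumerator, the `static_assert` and the `name_of` helper are my additions for illustration, not part of the snippet above.

```cpp
#include <cassert>
#include <cstring>

// Single source of truth for the list of names.
#define NAMES(_) _(Alice) _(Bob) _(Caravaggio) _(DuncanIdaho)
#define BARE(_) _ ,
#define STRG(_) #_ ,

// Expands to: enum Name { Alice , Bob , Caravaggio , DuncanIdaho , NAME_COUNT };
enum Name { NAMES(BARE) NAME_COUNT };

// Expands to: { "Alice" , "Bob" , "Caravaggio" , "DuncanIdaho" , }
const char *const names[] = { NAMES(STRG) };

// Both tables come from NAMES, so they cannot drift apart.
static_assert(static_cast<int>(sizeof names / sizeof names[0]) == NAME_COUNT,
              "enum and string table must stay in sync");

// Hypothetical helper: printable string for an enumerator.
const char *name_of(Name n) { return names[n]; }

// Small self-check of the mapping.
void check_names() {
    assert(std::strcmp(name_of(Alice), "Alice") == 0);
    assert(std::strcmp(name_of(DuncanIdaho), "DuncanIdaho") == 0);
}
```

Adding a name means touching only the NAMES line; the enum and the string table both pick it up on the next build.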

@Koushik correct - there is a sequence point at the `?` (i.e. after the evaluation of the condition) but before the execution of one of the two branches. — Nik Bougalis, Apr 26 '13 at 19:13
You could highlight the duplicate function calls as well. Even if they are *pure*, they are a waste of time. — artless noise, Apr 26 '13 at 23:25
Here's a 'once-only' macro that solves this problem for common lisp: http://www.gigamonkeys.com/book/macros-defining-your-own.html . Doug Hoyte also has a version: http://letoverlambda.com/textmode.cl/guest/chap3.html#sec_6 — Clayton Stanley, Apr 27 '13 at 03:39

score 17 · Answer 2 · answered Apr 26 '13 at 19:05

Macros effectively do a copy/paste before the program is compiled.

unsafe(x++)

Would become

( (x++) >= 0 ? (x++) : -(x++) )

Drew Dormann

I would probably describe it more as a find/replace all (with a couple more rules than a basic text search). – Bob Apr 27 '13 at 04:09

score 10 · Answer 3 · answered Apr 26 '13 at 19:05

The preprocessor replaces the macro before compilation.

The compiler sees this:

  ( (x++) >= 0 ? (x++) : -(x++) )

Andy Thomas

Kaz · Answer 4 · 2013-04-26T23:58:15.797

unsafe(x) evaluates the expression x twice. Once to determine its truth value, and then a second time in one of the two branches of the ternary operator. The inline function safe receives an evaluated argument: the expression is evaluated once prior to the function call, and the function call works with local variables.
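A small sketch makes the difference visible in both the result and the final value of x. The `Outcome` struct and the helper functions are hypothetical names I am introducing here; the macro and the inline function are the ones from the question.

```cpp
#include <cassert>

#define unsafe(i) ( (i) >= 0 ? (i) : -(i) )

inline int safe(int i) { return i >= 0 ? i : -i; }

// Hypothetical helper type: the computed value plus the final value of x.
struct Outcome {
    int ans;
    int x;
};

Outcome via_macro(int x) {
    int ans = unsafe(x++);  // expands to ( (x++) >= 0 ? (x++) : -(x++) )
    return {ans, x};
}

Outcome via_inline(int x) {
    int ans = safe(x++);    // x++ is evaluated once, before the call
    return {ans, x};
}

// Starting from x == 0: the macro tests 0, then yields the already
// incremented value 1 and leaves x at 2; the function yields 0, x at 1.
void check_from_zero() {
    assert(via_macro(0).ans == 1 && via_macro(0).x == 2);
    assert(via_inline(0).ans == 0 && via_inline(0).x == 1);
}
```

So the macro does not merely bump x one time too many: the value it returns comes from the second evaluation, which is why `unsafe(x++)` with x == -5 gives 4 rather than 5.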

unsafe is actually not quite as unsafe as it could be. The ternary operator introduces a sequence point between evaluating the test and evaluating either the consequent or the alternative expression. unsafe(x++) will reliably increment x twice, though, of course, the problem is that this behavior is unexpected. In general, macros which expand an expression more than once do not have this assurance. Usually, they produce outright undefined behavior!
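To illustrate the undefined-behavior case, here is a sketch using a hypothetical `SQUARE` macro (not from the FAQ) whose two expansions of the argument are operands of a single `*`, with nothing sequencing one before the other. The offending call is left commented out on purpose; the function template `square` is one possible single-evaluation replacement, not the only one.

```cpp
#include <cassert>

// Hypothetical macro: both copies of x are operands of one '*', so
// SQUARE(n++) would modify n twice without sequencing between them.
#define SQUARE(x) ( (x) * (x) )

// Function alternative: the argument is evaluated exactly once.
template <typename T>
constexpr T square(T v) { return v * v; }

int square_and_advance(int &n) {
    // int bad = SQUARE(n++);  // undefined behavior: do not do this
    return square(n++);        // well defined: n++ evaluated once
}

// The macro is fine for arguments without side effects.
void check_plain_argument() {
    assert(SQUARE(2 + 1) == 9);
}
```

With `int n = 3;`, calling `square_and_advance(n)` gives 9 and leaves n at 4. No such statement can be made about the macro version, since the compiler is free to do anything with it.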

Circa 1999 I produced a library module for catching uses of macros with side effects.

So, you can write "evil" macros and use them, and the machine will catch situations where they are accidentally used with arguments that have side effects (provided you have adequate code coverage to hit those uses at run-time).

Here is the test program, unsafe.c. Note that it includes a header file sfx.h and uses a SFX_CHECK macro in the expansion token sequence of unsafe:

#include "sfx.h"

#define unsafe(i)  \
          ( (SFX_CHECK(i)) >= 0 ? (i) : -(i) )

inline
int safe(int i)
{
  return i >= 0 ? i : -i;
}

int f(void)
{
  return 0;
}

int main(void)
{
  int ans;
  int x = 0;

  ans = unsafe(x++);   // Error! x is incremented twice
  ans = unsafe(f());   // Danger! f() is called twice

  ans = safe(x++);     // Correct! x is incremented once
  ans = safe(f());     // Correct! f() is called once
}

We compile everything and run from a Linux shell prompt:

$ gcc unsafe.c hash.c except.c sfx.c -o unsafe
$ ./unsafe
unsafe.c:22: expression "x++" has side effects
unsafe.c:23: expression "f()" may have side effects

Note that x++ certainly has side effects, whereas f may or may not. So the messages are differently worded. A function being called twice is not necessarily an issue, because a function might be pure (have no side effects).

You can get that here if you're curious about how it works. There is a run-time penalty if the debugging is enabled; of course SFX_CHECK can be disabled so it does nothing (similarly to assert).

The first time an SFX_CHECK protected expression is evaluated, it is parsed in order to determine whether it might have side effects. Because this parsing is done without any access to symbol table information (how identifiers are declared), it is ambiguous. The parser pursues multiple parsing strategies. Backtracking is done using an exception-handling library based on setjmp/longjmp. The parse results are stored in a hash table keyed on the textual expression for faster retrieval on future evaluations.
