
So I have the (mostly vilified) `#define MAX( a, b ) ( ((a) > (b)) ? (a) : (b) )` somewhere in a program (yes, yes, I know). At some point in the code there is a comparison `X > -1`, where X is (as far as I can tell) a (signed) integer. The line is `j += MAX(bmGs[i], bmBc[(int)y[i + j]] - m + 1 + i);`, where `y` here is a `char*`. Not necessarily surprisingly, I find that the macro is returning -1 as the larger number (I'm guessing a value too large for `int` or an unsigned issue, but I can't find it). I would like to know techniques you may have for finding these kinds of errors.

Notice that I'm not asking for programming advice about whether or not to use that macro; I'm sure folks are dying to tell me I should refrain from things like that, but the question is going somewhere else.

Thanks.

Dervin Thunk
    Could you please update it with some code showing the actual variable types? Looking at just the part you quoted, I don't see any problem with the macro. Have you checked the preprocessed code? – Vikram.exe Jan 01 '11 at 15:27

3 Answers


I think your problem may be that standard C will compare a signed integer and an unsigned integer by promoting -1 to an unsigned quantity (value preserving instead of sign preserving), which in turn means that X is never greater than the converted -1, and the comparison is always false. If I'm correct, the macro is a red herring; the problem is not the macro per se (which is written correctly and works fine as long as the types are sensible and neither argument has side effects).
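
A minimal illustration of that conversion, with toy values rather than the asker's code:

    #include <stdio.h>

    int main(void)
    {
        int      x = 5;
        unsigned u = -1;   /* -1 converted to unsigned becomes UINT_MAX */

        /* For the comparison, x is converted to unsigned as well, so this
           effectively tests 5 > 4294967295 (on a typical 32-bit unsigned)
           and is always false.                                            */
        if (x > u)
            printf("x is larger\n");
        else
            printf("x is NOT larger\n");   /* this branch runs */
        return 0;
    }

Compiling with `-Wall -Wextra` (or explicitly `-Wsign-compare`) makes GCC flag exactly this kind of signed/unsigned comparison.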

Is this code the Boyer-Moore string search, as described by Christian Charras and Thierry Lecroq?

If `bmGs` is unsigned but `bmBc` is signed, or vice versa, you'd have problems, but both arrays are signed integers in the original.

Jonathan Leffler
  • Thanks, Jonathan. That's exactly right, it's BM by Charras et al. Problem is, I can't seem to find the "unsignedness" that would bring this bug up, both arrays are signed, which makes me wonder where on earth the bug is. Of course, if I use a function, instead of the macro, the whole thing works. I'm wondering if the pointer to `char` is doing something funky, but it's cast to `int`! Hence the question. – Dervin Thunk Jan 01 '11 at 23:24
  • @Dervin: what platform are you on? Are plain `char` values signed or unsigned? – Jonathan Leffler Jan 02 '11 at 02:11

If you're using GCC, compile with the `-E` flag (that's a capital "E"). This runs only the preprocessor stage, sends the result to standard output, and then stops.

This will let you see how your macros are being expanded. If you really want, you can redirect that output to a file, modify the file, and recompile it, to play around with things and try to get it to work.
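
For instance, with a toy file like this (just the macro and a stand-in for the problem line, not the asker's actual code):

    /* expand.c -- toy reproduction to inspect the macro expansion */
    #define MAX(a, b) ( ((a) > (b)) ? (a) : (b) )

    int pick(int gs, int bc, int m, int i)
    {
        return MAX(gs, bc - m + 1 + i);
    }

`gcc -E expand.c` prints the translation unit with the macro already expanded, so the `return` line comes out roughly as `return ( ((gs) > (bc - m + 1 + i)) ? (gs) : (bc - m + 1 + i) );`, and you can then work out the type of each subexpression by hand.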

Brad

I'd guess that one of your `m` or `i` is a `size_t` or something like that? Then the second argument of the `MAX` macro is a `size_t`, too, and perhaps it evaluates to `(size_t)-1`.
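
For example, if `m` really were a `size_t`, something like this could happen (hypothetical values, not the asker's code):

    #include <stdio.h>
    #include <stddef.h>

    #define MAX(a, b) ( ((a) > (b)) ? (a) : (b) )

    int main(void)
    {
        int    gs = 4;
        size_t m  = 6;               /* hypothetical: m declared as size_t */
        int    bc = 2, i = 1;

        /* bc - m + 1 + i is evaluated in size_t: the intermediate -4 wraps
           to a huge unsigned value, so MAX picks it instead of gs.         */
        int j = MAX(gs, bc - m + 1 + i);

        printf("%d\n", j);           /* typically prints -2, not 4 */
        return 0;
    }

The huge `size_t` value is then converted back to `int` on the assignment, which is where a bogus small negative number like the asker's -1 can appear.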

In any case this code looks suspicious. Arrays such as your `y[]` whose elements have to be cast from `char` to `int` before being used as indices smell of a weak type design. Indexing with `char` without taking care of the possible signedness problems is just dangerous; index types should always be unsigned.
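
A small sketch of that `char` signedness trap (the byte value is made up for illustration):

    #include <stdio.h>

    int main(void)
    {
        char y[] = "\xE9";   /* any byte above 0x7F will do */

        /* Where plain char is signed, (int)y[0] is negative, so an index
           like bmBc[(int)y[0]] would read before the start of the table.
           Going through unsigned char first keeps the index in 0..255.   */
        printf("(int)y[0]                = %d\n", (int)y[0]);
        printf("(int)(unsigned char)y[0] = %d\n", (int)(unsigned char)y[0]);
        return 0;
    }

On a platform with signed plain `char` this prints -23 and 233; with unsigned `char`, both lines print 233.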

A macro like your `MAX` that implicitly assumes both expressions have the same signedness and width is also dangerous. Replace it with a function `intmax_t MAX(intmax_t, intmax_t)` or something like that.
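
A minimal sketch of such a replacement (the name `max_i` is just for illustration):

    #include <stdint.h>

    /* Both arguments are converted to the same signed type at the call, so
       the comparison itself can never mix signed and unsigned operands.    */
    static intmax_t max_i(intmax_t a, intmax_t b)
    {
        return a > b ? a : b;
    }

Being a real function, it also evaluates each argument exactly once, which the macro does not guarantee.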

Redesign the types properly and your problem will solve itself.

clang as a compiler is quite good for debugging macros. Unlike gcc, it keeps track of macro expansions in its diagnostics and is sometimes able to point you straight at a problematic macro expansion.

Jens Gustedt