94

I'm trying to compile this piece of code from the book "The C Programming Language" (K & R). It is a bare-bones version of the UNIX program wc:

#include <stdio.h>

#define IN   1;     /* inside a word */
#define OUT  0;     /* outside a word */

/* count lines, words and characters in input */
main()
{
    int c, nl, nw, nc, state;

    state = OUT;
    nl = nw = nc = 0;
    while ((c = getchar()) != EOF) {
        ++nc;
        if (c == '\n')
            ++nl;
        if (c == ' ' || c == '\n' || c == '\t')
            state = OUT;
        else if (state == OUT) {
            state = IN;
            ++nw;
        }
    }
    printf("%d %d %d\n", nl, nw, nc);
}

And I'm getting the following error:

$ gcc wc.c 
wc.c: In function ‘main’:
wc.c:18: error: ‘else’ without a previous ‘if’
wc.c:18: error: expected ‘)’ before ‘;’ token

The 2nd edition of this book is from 1988 and I'm pretty new to C. Maybe it has to do with the compiler version or maybe I'm just talking nonsense.

I've seen in modern C code a different use of the main function:

int main()
{
    /* code */
    return 0;
}

Is this a new standard or can I still use a type-less main?

lost_in_the_source
  • 10,998
  • 9
  • 46
  • 75
César
  • 9,939
  • 6
  • 53
  • 74
  • 4
    Not an answer, but another piece of code to look at more closely, `|| c = '\t')`. Does that seem the same as the other code on that line? – user7116 Dec 27 '11 at 03:19
  • Sorry it was a typo. I'mt still getting that error – César Dec 27 '11 at 03:21
  • You need to use an up to date version of the book – David Heffernan Dec 27 '11 at 08:21
  • 2
    @DavidHeffernan - The 1988 edition is the revised (second edition) ANSI C version of the book. To the best of my knowledge that is still a totally valid version of the book [besides, there is no third edition of it...] – mac Dec 27 '11 at 13:53
  • @mac `main()` does not bode well for modern C – David Heffernan Dec 27 '11 at 14:18
  • 58
    32 upvotes for a debugging + typo question?! – Lightness Races in Orbit Dec 27 '11 at 15:48
  • 37
    @TomalakGeret'kal: you know, old stuff is valued more (wine, paintings, C code) – Sergio Tulentsev Dec 27 '11 at 17:29
  • 3
    BTW, yes, the new C standard defines the "main" function as "int main(int, char**)"; you should stick to it, and modern compilers will warn you about this. – Massimo Dec 27 '11 at 19:38
  • 1
    Thanks @Massimo I guess I need an up to date book after reading this one :) – César Dec 27 '11 at 22:57
  • 3
    `main` is defined either as `int main(void)`, or as `int main(int argc, char**argv)`. As of the 1990 stand, the `int` return type could be omitted, but C99 removed that rule -- and omitting it was never a particularly good idea. – Keith Thompson Dec 27 '11 at 23:54
  • 2
    @TomalakGeret'kal what's wrong with upvoting debugging + typo? Is that ilegal? We are not all C gurus! Thank God not everybody share your thoughts. Next time please keep that to yourself – César Dec 28 '11 at 15:14
  • 16
    @César: I am quite within my rights to express my opinion, and I'll thank you not to try to censor it. As it happens, yes, this is not a website for debugging your code and solving your typographical errors, which are "localised" issues that will never help anybody else. It's a website for _questions about programming languages_, not for doing your basic debugging and reference work for you. Skill level is completely irrelevant. Read the FAQ, and perhaps also [this meta question](http://meta.stackexchange.com/questions/105365/stack-overflow-has-too-many-too-localised-new-questions). – Lightness Races in Orbit Dec 28 '11 at 15:18
  • 11
    @TomalakGeret'kal of course you can express your opinion and I won't censor your comment in spite of being unconstructive. I've already read the FAQ. I'm an **enthusiast programmer** asking about an **actual problem that I'm facing** – César Dec 28 '11 at 15:26

9 Answers9

247

Your problem is with your preprocessor definitions of IN and OUT:

#define IN   1;     /* inside a word */
#define OUT  0;     /* outside a word */

Notice how you have a trailing semicolon in each of these. When the preprocessor expands them, your code will look roughly like:

    if (c == ' ' || c == '\n' || c == '\t')
        state = 0;; /* <--PROBLEM #1 */
    else if (state == 0;) { /* <--PROBLEM #2 */
        state = 1;;

That second semicolon causes the else to have no previous if as a match, because you are not using braces. So, remove the semicolons from the preprocessor definitions of IN and OUT.

The lesson learned here is that preprocessor statements do not have to end with a semicolon.

Also, you should always use braces!

    if (c == ' ' || c == '\n' || c == '\t') {
        state = OUT;
    } else if (state == OUT) {
        state = IN;
        ++nw;
    }

There is no hanging-else ambiguity in the above code.

user7116
  • 63,008
  • 17
  • 141
  • 172
  • 9
    For clarity, the problem isn't the spacing, it's the semicolons. You don't need them in preprocessor statements. – Dan Dec 27 '11 at 03:21
  • @Dan thanks for the clarification! And the semicolons were indeed the problem! Thanks guys! – César Dec 27 '11 at 03:23
  • 2
    @César: you're welcome. The bracing suggestion will hopefully keep you out of trouble in the future, certainly has helped me! – user7116 Dec 27 '11 at 03:33
  • 5
    @César: It's also a good idea to get used to putting parenthesis around macros since you generally want the macro to be evaluated first. In this case it doesn't matter since the value is a single token, but leaving out parens can lead to unexpected results when defining an expression. – styfle Dec 27 '11 at 08:25
  • 7
    "don't need them" != "shouldn't have them". the former is always true; the latter is context-dependent and is the more pertinent issue in this scenario. – Lightness Races in Orbit Dec 27 '11 at 15:46
  • Note that the pre-processor doesn't know about C comments, so the comments will be included in the macro expansion. Tip: Put comments on a separate row above the macro. Above all: never ever use `//` comments in a macro (unless you are participating in a C obfuscation contest). – Klas Lindbäck Nov 09 '16 at 13:04
64

The main problem with this code is that it is not the code from K&R. It includes semicolons after the macros definitions, which were not present in the book, which as others have pointed out changes the meaning.

Except when making a change in an attempt to understand the code, you should leave it alone until you do understand it. You can only safely modify code you understand.

This was probably just a typo on your part, but it does illustrate the need for understanding and attention to details when programming.

jmoreno
  • 12,752
  • 4
  • 60
  • 91
  • 9
    Your advice isn't terribly constructive for someone learning to program. Modifying code is precisely how you understand the details of programming. – user7116 Dec 27 '11 at 21:49
  • 12
    @sixlettervariables: And when doing so, you should know what changes you've made, and make as few changes a possible. If the OP had made the changes deliberately, and made as few change as possible, he probably wouldn't have asked this question, as it would have been clear to him what was going on. He would have changed the macro for IN, with no errors and then the macro for OUT with the two errors, the second of which would be complaining about the semicolon he had just added. – jmoreno Dec 27 '11 at 22:07
  • 5
    It seems like unless you make the mistake of including a semicolon on the end of a preprocessor directive line, you likely wouldn't know that you aren't to include them. You could take it at face value, you could read lots of code and notice they never seem to be there. Or, the OP could mess up by including them, ask about the "bizarre" error, and find out: oops, no semicolons required for preprocessor directives! This is programming, not an episode of Scared Straight. – user7116 Dec 27 '11 at 22:17
  • 1
    @jmoreno I totally understand the code that's why the title is "What's wrong with the code" and not "What does this code do" – César Dec 27 '11 at 22:43
  • 1
    And it totally agree with @sixlettervariables about modifying the code, but in this case I was trying to write the code from scratch and not trying to modify it. – César Dec 27 '11 at 22:52
  • 14
    @sixlettervariables: Yes, but when the code doesn't work, the obvious first step is to go "oh, ok, then what I changed without any reason whatsoever from the code written in a book by the inventor of C, was probably the issue. I'll just undo that then." – Lightness Races in Orbit Dec 28 '11 at 00:42
  • 2
    @TomalakGeret'kal: if he'd copy and pasted the code he wouldn't have learned as much about the preprocessor today. Besides, it's not like that book doesn't have an erratum. – user7116 Dec 28 '11 at 01:24
  • @sixlettervariables: At the very least, you don't need to be an expert programmer to indicate in the question itself that you're not showing us the actual code from the book. It's misleading as it stands. – Lightness Races in Orbit Dec 28 '11 at 01:28
  • 2
    @TomalakGeret'kal: I figured it was an unintentional error during transcription that he missed due to not understanding the significance or differences of the preprocessor. Why this has become such a huge deal, I'm not sure. – user7116 Dec 28 '11 at 01:32
  • @sixlettervariables: Well, you started it by suggesting that this answer is not helpful! All I'm saying is that I think it certainly is. Yes, we all learn by failing, but suggesting that developing a habit for non-analytical techniques is somehow the proper way to be a programmer is just folly. – Lightness Races in Orbit Dec 28 '11 at 01:46
  • 2
    @TomalakGeret'kal: I doubt this answer will be helpful to budding programmers (it's sound advice to someone programming at a company though). Modifying code and running into problems is one of the best ways to learn. Right brain/left brain, some folks just won't notice unless it's pointed out for them. – user7116 Dec 28 '11 at 01:52
  • 1
    @sixlettervariables: It's like you're ignoring my comments :( Oh well! – Lightness Races in Orbit Dec 28 '11 at 01:54
  • 3
    let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/6174/discussion-between-sixlettervariables-and-tomalak-geretkal) – user7116 Dec 28 '11 at 01:54
  • @sixlettervariables: including a semicolon as PART of a preprocessor directive is not a mistake. When expanded it may or may not be syntactically valid, but that doesn't make it a mistake, it may just mean you shouldn't use it there. – jmoreno Jan 22 '12 at 00:09
34

There should not be any semicolons after the macros,

#define IN   1     /* inside a word */
#define OUT  0     /* outside a word */

and it should probably be

if (c == ' ' || c == '\n' || c == '\t')
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
onemach
  • 4,265
  • 6
  • 34
  • 52
24

The definitions of IN and OUT should look like this:

#define IN   1     /* inside a word  */
#define OUT  0     /* outside a word */

The semicolons were causing the problem! The explanation is simple: both IN and OUT are preprocessor directives, essentially the compiler will replace all occurrences of IN with a 1 and all occurrences of OUT with a 0 in the source code.

Since the original code had a semicolon after the 1 and the 0, when IN and OUT got replaced in the code, the extra semicolon after the number produced invalid code, for instance this line:

else if (state == OUT)

Ended up looking like this:

else if (state == 0;)

But what you wanted was this:

else if (state == 0)

Solution: remove the semicolon after the numbers in the original definition.

Óscar López
  • 232,561
  • 37
  • 312
  • 386
8

As you see there was a problem in macros.

GCC has option for stopping after pre-processing. (-E) This option is useful to see the result of pre-processing. In fact the technique is an important one if you are working with large code base in c/c++. Typically makefiles will have a target to stop after pre-processing.

For quick reference : The SO question covers the options -- How do I see a C/C++ source file after preprocessing in Visual Studio?. It starts with vc++, but also has gcc options mentioned down below.

Community
  • 1
  • 1
Jayan
  • 18,003
  • 15
  • 89
  • 143
7

Not exactly a problem, but the declaration of main() is also dated, it should be like something this.

int main(int argc, char** argv) {
    ...
    return 0;
}

The compiler will assume an int return value for a function w/o one, and I'm sure the compiler/linker will work around the lack of declaration for argc/argv and the lack of return value, but they should be there.

BillRobertson42
  • 12,602
  • 4
  • 40
  • 57
  • 3
    That's a good book - one of the only two worth while books on C as far as I know. I'm pretty sure that newer editions are ANSI C compliant (probably pre C99 ANSI C). The other worth while book on C is Expert C Programming Deep C Secrets by Peter van der Linden. – BillRobertson42 Dec 27 '11 at 04:18
  • I never said it was. I was simply commented that to bring it in line with the way things are done today, that main should be changed. – BillRobertson42 Dec 27 '11 at 15:11
4

Try adding explicit braces around code blocks. The K&R style can be ambiguous.

Look at line 18. The compiler is telling you where the issue is.

    if (c == '\n') {
        ++nl;
    }
    if (c == ' ' || c == '\n' || c == '\t') { // You're missing an "=" here; should be "=="
        state = OUT;
    }
    else if (state == OUT) {
        state = IN;
        ++nw;
    }
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
duffymo
  • 305,152
  • 44
  • 369
  • 561
  • 2
    Thanks! Actually, the code worked without the braces in the second if :) – César Dec 27 '11 at 03:26
  • 5
    +1. Not just ambiguous but somewhat dangerous. When (if) you add a line to your `if` block later on, if you forget to add the braces because your block is now more than one line, it can take a while to debug that error... – The111 Dec 27 '11 at 03:26
  • 8
    @The111 Never, ever, happened to me. I still don’t believe that this is a real problem. I’ve been using the brace-less style for over a decade, I’ve never once forgot to add the braces when expanding the body of a block. – Konrad Rudolph Dec 27 '11 at 09:59
  • 1
    @The111: In this case it took a few SO contributors a handful of minutes :P And if you're a programmer who is capable of adding statements to an `if` clause and "forgetting" to update the braces then, well, you're not a very good programmer. – Lightness Races in Orbit Dec 27 '11 at 15:47
3

A simple way is to use brackets like {} for each if and else:

if (c == '\n'){
    ++nl;
}
if (c == ' ' || c == '\n' || c == '\t')
{
    state = OUT;
}
else if (state == OUT) {
    state = IN;
    ++nw;
}
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Nauman Khalid
  • 852
  • 9
  • 15
2

As other answers pointed out, the problem is in #define and semicolons. To minimize these problems I always prefer defining number constants as a const int:

const int IN = 1;
const int OUT = 0;

This way you get rid of many problems and possible problems. It is limited by just two things:

  1. Your compiler has to support const - which in 1988 wasn't generally true, but now it's supported by all commonly used compilers. (AFAIK the const is "borrowed" from C++.)

  2. You can't use these constants in some special places where you would need a string-like constant. But I think your program isn't that case.

Al Kepp
  • 5,831
  • 2
  • 28
  • 48
  • An alternative I prefer is enums - they can be used in the special places (like array declarations) that `const int` can't in C. – Michael Burr Feb 22 '12 at 19:54