4

I am writing a simple program that should count the occurrences of the ternary operator ?: in C source code, and I am trying to keep it as simple as possible. So I have filtered these things out of the source code:

  1. String literals " "
  2. Character constants ' '
  3. Trigraph sequences ??=, ??(, etc.
  4. Comments
  5. Macros

And now I am only counting the occurrences of question marks.

So my question is: is there any other symbol, operator, or anything else that could cause a problem, i.e. contain '?'?

Let's suppose that the source is syntactically valid.

Cœur
ITman

3 Answers

4

I think you have found all the places where a question mark can be introduced, and have therefore eliminated all possible false positives (for the ternary operator). But maybe you eliminated too much: maybe you want to count the "?:"s that get introduced by macros, and you don't count those. Is that what you intend? If so, you're done.
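The filter-then-count approach from the question can be sketched roughly as follows. This is not the asker's Perl script; it is a minimal Python illustration, and the function name `count_question_marks` is made up for this example. It assumes syntactically valid input, replaces trigraphs and splices backslash-newline pairs first (as a C compiler would), then skips comments, string literals, character constants, and whole preprocessor directive lines, counting every remaining `?`.

```python
import re

# The nine standard trigraphs, keyed by their third character.
_TRIGRAPHS = {"=": "#", "(": "[", "/": "\\", ")": "]", "'": "^",
              "<": "{", "!": "|", ">": "}", "-": "~"}
_TRIGRAPH_RE = re.compile(r"\?\?([=(/)'<!>-])")


def count_question_marks(source: str) -> int:
    """Count '?' characters outside literals, comments and directives."""
    # Replace trigraphs first, then join lines ending in a backslash.
    text = _TRIGRAPH_RE.sub(lambda m: _TRIGRAPHS[m.group(1)], source)
    text = text.replace("\\\n", "")

    count = 0
    i, n = 0, len(text)
    line_start = True      # only whitespace/comments seen on this line so far
    in_directive = False   # inside a '#...' preprocessor line
    while i < n:
        ch = text[i]
        if text.startswith("/*", i):
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        elif text.startswith("//", i):
            end = text.find("\n", i)
            i = n if end == -1 else end  # the newline is handled below
        elif ch in "\"'":
            # Skip a string literal or character constant, honouring escapes.
            i += 1
            while i < n and text[i] != ch and text[i] != "\n":
                i += 2 if text[i] == "\\" else 1
            if i < n and text[i] == ch:
                i += 1
            line_start = False
        elif ch == "\n":
            line_start, in_directive = True, False
            i += 1
        else:
            if ch == "#" and line_start:
                in_directive = True
            elif ch == "?" and not in_directive:
                count += 1
            if not ch.isspace():
                line_start = False
            i += 1
    return count
```

For example, `count_question_marks('int x = a ? b : c; /* why? */')` returns 1. Because directive lines are skipped entirely, a `?:` inside a macro body is not counted, which is exactly the trade-off described above.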

Bernd Elkemann
3

Run your tool on preprocessed source code (you can get this by running e.g. gcc -E). This will have done all macro expansions (as well as #include substitution), and eliminated all trigraphs and comments, so your job will become much easier.
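As a rough sketch of how a script might drive the preprocessor, assuming gcc is installed and on the PATH (the helper names `preprocess_command` and `preprocess` are invented for this example): `-E` stops after preprocessing, `-P` suppresses the `# line` markers in the output, and `-trigraphs` asks GCC to replace trigraphs, which it does not do in its default GNU mode. String literals and character constants survive preprocessing, so those still have to be skipped when counting.

```python
import subprocess


def preprocess_command(path, compiler="gcc"):
    """Build the argument list for a preprocess-only run."""
    # -E: preprocess only; -P: no linemarkers; -trigraphs: replace trigraphs.
    return [compiler, "-E", "-P", "-trigraphs", path]


def preprocess(path, compiler="gcc"):
    """Return the preprocessed text of a C file (requires the compiler)."""
    result = subprocess.run(preprocess_command(path, compiler),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the command as a list rather than a shell string avoids quoting problems with unusual file names.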

Oliver Charlesworth
  • +1 for using good already-existing software to eliminate the problem – orlp Mar 16 '11 at 12:48
  • definite +1. although OP has accepted what I wrote this could be what he or someone else in similar situations needs. it all comes down to what he/she wants to count. – Bernd Elkemann Mar 16 '11 at 12:50
  • Calling gcc and other tools is not permitted, and their availability is not guaranteed. The analyzer is a Perl script... – ITman Mar 16 '11 at 12:54
-1

In K&R ANSI C the only places where a question mark can validly occur are:

  1. String literals " "
  2. Character constants ' '
  3. Comments

Now you might notice macros and trigraph sequences are missing from this list.

I didn't include trigraph sequences since they are a compiler extension and not "valid C". I don't mean you should remove the check from your program; I'm trying to say you have already gone further than what's needed for ANSI C.

I also didn't include macros because when you're talking about a character that can occur in macros you can mean two things:

  1. Macro names/identifiers
  2. Macro bodies

The ? character cannot occur in macro identifiers (http://stackoverflow.com/questions/369495/what-are-the-valid-characters-for-macro-names), and I see macro bodies as regular C code, so the first list (string literals, character constants and comments*) should cover them too.

* Can macros validly contain comments? Because if I use this:

#define somemacro 15 // this is a comment

then // this is a comment isn't part of the macro. But what if I compile this C file with -D somemacro="15 // this is a comment"?

orlp
  • trigraph sequences are valid C. See 5.2.1.1 in [the standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf) – pmg Mar 16 '11 at 12:29
  • 1
    -1 This whole post is incorrect. And this is why you should refer to **ISO C** and not some K&R mumbo ANSI jumbo. Trigraphs are perfectly valid standard C. Read all about them in the C standard chapter 5.2.1.1 or in K&R **2nd** edition. Also for your information, K&R 1st and 2nd editions know nothing about // comments. They aren't valid K&R mumbo ANSI jumbo C. – Lundin Mar 16 '11 at 12:29
  • I'm sorry. I am new to C, and I just worked through K&R **1st** edition. It was stupid of me to think that all that information (despite it still being a good book) is still relevant. – orlp Mar 16 '11 at 12:46
  • If you are as new to C as you claim, shouldn't you think twice about answering at all? – Olof Forshell Mar 19 '11 at 19:17