I want to remove all comments in a toy.c file. From Remove comments from C/C++ code I see that I could use

gcc -E -fpreprocessed -P -dD toy.c

But some of my code (say, deprecated functions that I don't want to compile) is wrapped between #if 0 and #endif, as if it were commented out.

  • On one hand, the above command does not remove this type of "comment" because its removal is only possible during macro expansion, which -fpreprocessed prevents;
  • On the other hand, I have other macros I don't want to expand, so dropping -fpreprocessed is a bad idea.

I see a dilemma here. Is there a way out of this situation? Thanks.


The following toy example "toy.c" is sufficient to illustrate the problem.

#define foo 3  /* this is a macro */

// a toy function
int main (void) {
  return foo;
  }

// this is deprecated
#if 0
int main (void) {
  printf("%d\n", foo);
  return 0;
  }
#endif

gcc -E -fpreprocessed -P -dD toy.c gives

#define foo 3
int main (void) {
  return foo;
  }
#if 0
int main (void) {
  printf("%d\n", foo);
  return 0;
  }
#endif

while gcc -E -P toy.c gives

int main (void) {
  return 3;
  }
Zheyuan Li
    Use a source-control system (like Git, Subversion or other) and delete the old code from the source. If you find you might need it later, it's still available in the source control system, but doesn't clutter up your current code. – Some programmer dude Sep 09 '18 at 19:51

3 Answers

There's a pair of programs, sunifdef ("Son of unifdef", which is available from unifdef) and coan, that can be used to do what you want. The question Is there a C pre-processor which eliminates #ifdef blocks based on values defined/undefined? has answers which discuss these programs.

For example, given "xyz37.c":

#define foo 3  /* this is a macro */

// a toy function
int main (void) {
  return foo;
  }

// this is deprecated
#if 0
int main (void) {
  printf("%d\n", foo);
  }
#endif

Using sunifdef

sunifdef -DDEFINED -ned < xyz37.c

gives

#define foo 3  /* this is a macro */

// a toy function
int main (void) {
  return foo;
  }

// this is deprecated

and given this file "xyz23.c":

#if 0
This is deleted
#else
This is not deleted
#endif

#if 0
Deleted
#endif

#if defined(XYZ)
XYZ is defined
#else
XYZ is not defined
#endif

#if 1
This is persistent
#else
This is inconsistent
#endif

The program

sunifdef -DDEFINE -ned < xyz23.c

gives

This is not deleted

#if defined(XYZ)
XYZ is defined
#else
XYZ is not defined
#endif

This is persistent

This is, I think, what you're after. The -DDEFINED option seems to be necessary; choose any name that you do not use in your code. You could use -UNEVER_DEFINE_THIS instead, if you prefer. The -ned option evaluates the constant terms and eliminates the relevant code. Without it, constant terms like 0 and 1 are not eliminated.
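To make the effect of evaluating constant terms concrete, here is a minimal Python sketch. It is not how sunifdef is implemented, only an illustration of the behaviour shown above. The name fold_constant_ifs is made up for this example, and the sketch assumes conditions are written as a literal 0 or 1 with no trailing comment, and that no #elif is attached to a constant #if.

```python
import re

# Hypothetical illustration only: handles the literal conditions "#if 0" and "#if 1".
_IF = re.compile(r"\s*#\s*(if|ifdef|ifndef)\b(.*)")
_ELSE = re.compile(r"\s*#\s*else\b")
_ENDIF = re.compile(r"\s*#\s*endif\b")


def fold_constant_ifs(text):
    """Drop branches ruled out by '#if 0' / '#if 1'; keep all other directives."""
    out = []
    # One entry per open conditional: None for a non-constant condition,
    # otherwise True/False meaning "the branch we are in is kept".
    stack = []
    for line in text.splitlines(keepends=True):
        m = _IF.match(line)
        if m:
            cond = m.group(2).strip()
            if m.group(1) == "if" and cond in ("0", "1"):
                stack.append(cond == "1")
                continue  # the constant directive itself is dropped
            stack.append(None)
        elif _ELSE.match(line) and stack and stack[-1] is not None:
            stack[-1] = not stack[-1]  # switch to the other branch
            continue
        elif _ENDIF.match(line) and stack:
            if stack.pop() is not None:
                continue  # drop the #endif of a constant conditional
        # Write the line unless some enclosing constant branch is dead.
        if all(state is not False for state in stack):
            out.append(line)
    return "".join(out)
```

Feeding it the contents of "xyz23.c" keeps "This is not deleted", the whole #if defined(XYZ) block and "This is persistent", and drops the rest. Blank-line handling may differ from what sunifdef prints.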

I've used sunifdef happily for a number of years (encroaching on a decade). I've not yet found it to make a mistake, and I've used it to clean up some revoltingly abstruse sets of 'ifdeffery'. The program coan is a development of sunifdef with even more capabilities.

Jonathan Leffler

The preprocessor doesn't make exceptions. You cannot use it here to do that.

A simple state machine written in Python can work. It even handles nesting (though maybe not every case is covered, such as a nested #if 0, so compare the source before and after and validate manually). Directives inside comments aren't handled either (but it seems you have that covered).

the input (slightly more complex than yours for the demo):

#define foo 3
int main (void) {
  return foo;
  }
#if 0
int main (void) {
  #ifdef DDD
  printf("%d\n", foo);
  #endif
  }
#endif

void other_function()
{}

now the code, using regexes to detect #if & #endif.

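import re

# raw strings keep the regex backslashes intact
rif0 = re.compile(r"\s*#if\s+0")
rif = re.compile(r"\s*#(if|ifn?def)")
endif = re.compile(r"\s*#endif")

if_nesting = 0     # current depth of #if / #ifdef / #ifndef
if0_nesting = 0    # depth at which the active "#if 0" was opened
suppress = False   # True while inside an "#if 0" block

with open("input.c") as fin, open("output.c", "w") as fout:
    for l in fin:
        if rif.match(l):
            if_nesting += 1
            # only the outermost "#if 0" starts a suppressed region
            if not suppress and rif0.match(l):
                suppress = True
                if0_nesting = if_nesting
        elif endif.match(l):
            closes_if0 = suppress and if0_nesting == if_nesting
            if_nesting -= 1
            if closes_if0:
                suppress = False
                continue  # don't write the #endif that closes "#if 0"

        if not suppress:
            fout.write(l)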

the output file contains:

#define foo 3
int main (void) {
  return foo;
  }

void other_function()
{}

So the nesting worked and the #if 0 part was successfully removed. That is not something sed "/#if 0/,/#endif/d" can achieve.
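To see why a plain sed range is not enough, here is a small Python sketch that mimics a sed '/start/,/end/d' range delete. The function name delete_range is made up for this illustration, and it only models the range-address behaviour, nothing else of sed. The range ends at the first #endif it meets, which on nested input is the inner one.

```python
import re


def delete_range(lines, start_pat, end_pat):
    """Mimic sed '/start/,/end/d': drop from a start match through the next end match."""
    start, end = re.compile(start_pat), re.compile(end_pat)
    kept, deleting = [], False
    for line in lines:
        if deleting:
            if end.search(line):
                deleting = False  # the end line itself is deleted too
            continue
        if start.search(line):
            deleting = True
            continue
        kept.append(line)
    return kept


nested = [
    "#if 0",
    "int main (void) {",
    "  #ifdef DDD",
    '  printf("%d\\n", foo);',
    "  #endif",
    "  }",
    "#endif",
    "",
    "void other_function()",
    "{}",
]
leftover = delete_range(nested, r"#if 0", r"#endif")
print("\n".join(leftover))
```

On this input the deletion stops at the inner #endif, so a stray closing brace and the outer #endif survive, leaving broken C behind.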

Jean-François Fabre

Thanks for the other two answers.

I am now aware of unifdef and sunifdef. I am happy to know the existence of these tools, and that I am not the only one who wants to do this kind of code cleaning.

I have also written rm_if0_endif.c (attached below) to remove #if 0 ... #endif blocks, which is sufficient for me. It is based on text processing: it scans an input C file, locates each #if 0 and its matching #endif, and omits that block while copying character by character.
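The scan described above can also be sketched line by line rather than character by character. The Python version below is only an illustration of the counting idea, not a translation of rm_if0_endif.c. The name strip_if0_blocks is made up, and it assumes each directive starts its own line and is written exactly as "#if 0", as in the gcc -P output.

```python
def strip_if0_blocks(text):
    """Remove each '#if 0' ... matching '#endif' block, counting nested #if* lines."""
    out = []
    depth = 0  # 0 means we are outside any '#if 0' block
    for line in text.splitlines(keepends=True):
        s = line.lstrip()
        if depth == 0:
            if s.startswith("#if 0"):
                depth = 1  # enter the block and drop the directive
            else:
                out.append(line)
        elif s.startswith("#if"):  # also matches #ifdef and #ifndef
            depth += 1
        elif s.startswith("#endif"):
            depth -= 1  # reaching 0 closes the '#if 0' block
    return "".join(out)
```

Because whole lines are dropped, the newline after the closing #endif goes too, so the result has no blank line where the block used to be.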

The text-processing approach is limited, as it is designed for the #if 0 ... #endif case only, but that is all I need for now. A C program is not the only way to do this kind of text processing: Jean-François Fabre's answer demonstrates how to do it in Python, and I could do something similar in R using readLines, startsWith and writeLines. I chose C because I am not yet an expert in it, so this task drives me to learn. Here is a demo of my rm_if0_endif.c. Note that the program can concatenate several C files and add a header for each file.

original input file input.c

#define foo 3  /* this is a macro */

// a toy function
int test1 (void) {
  return foo;
  }

#if 0

#undef foo
#define foo 4

#ifdef bar
  #warning "??"
#endif

// this is deprecated
int main (void) {
  printf("%d\n", foo);
  return 0;
  }

#endif

// another toy
int test2 (void) {
  return foo;
  }

gcc pre-processing output "gcc_output.c" (taken as input for my program)

gcc -E -fpreprocessed -P -dD input.c > gcc_output.c

#define foo 3
int test1 (void) {
  return foo;
  }
#if 0
#undef foo
#define foo 4
#ifdef bar
  #warning "??"
#endif
int main (void) {
  printf("%d\n", foo);
  return 0;
  }
#endif
int test2 (void) {
  return foo;
  }

final output final_output.c from my program

rm_if0_endif.c has a utility function pattern_matching and a workhorse function rm_if0_endif:

void rm_if0_endif (char *InputFile,
                   char *OutputFile, char *WriteMode, char *OutputHeader);

The attached file below has a main function, doing

rm_if0_endif("gcc_output.c",
             "final_output.c", "w", "// this is a demo of 'rm_if0_endif.c'\n");

It produces:

// this is a demo of 'rm_if0_endif.c'
#define foo 3
int test1 (void) {
  return foo;
  }

int test2 (void) {
  return foo;
  }

Appendix: rm_if0_endif.c

#include <stdio.h>
int pattern_matching (FILE *fp, const char *pattern, int length_pattern) {
  int flag = 1;
  int i, c;
  for (i = 0; i < length_pattern; i++) {
    c = fgetc(fp);
    if (c != pattern[i]) {
      flag = 0; break;
      }
    }
  return flag;
  }
void rm_if0_endif (char *InputFile,
                   char *OutputFile, char *WriteMode, char *OutputHeader) {
  FILE *fp_r = fopen(InputFile, "r");
  FILE *fp_w = fopen(OutputFile, WriteMode);
  fpos_t pos;
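  if (fp_r == NULL || fp_w == NULL) {
    /* bail out instead of using a NULL stream */
    perror("error when opening file!");
    if (fp_r != NULL) fclose(fp_r);
    if (fp_w != NULL) fclose(fp_w);
    return;
    }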
  fputs(OutputHeader, fp_w);
  int c, i, a1, a2;
  int if_0_flag, if_flag, endif_flag, EOF_flag;
  const char *if_0 = "if 0";
  const char *endif = "endif";
  EOF_flag = 0;
  while (EOF_flag == 0) {
    do {
      c = fgetc(fp_r);
      while ((c != '#') && (c != EOF)) {
        fputc(c, fp_w);
        c = fgetc(fp_r);
        }
      if (c == EOF) {
        EOF_flag = 1; break;
        }
      fgetpos(fp_r, &pos);
      if_0_flag = pattern_matching(fp_r, if_0, 4);
      fsetpos(fp_r, &pos);
      if (if_0_flag == 0) fputc('#', fp_w);
      } while (if_0_flag == 0);
    if (EOF_flag == 1) break;
    a1 = 1; a2 = 0;
    do {
      c = fgetc(fp_r);
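      while ((c != '#') && (c != EOF)) c = fgetc(fp_r);
      if (c == EOF) {  /* unterminated "#if 0": stop instead of looping forever */
        EOF_flag = 1; break;
        }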
      fgetpos(fp_r, &pos);
      if_flag = pattern_matching(fp_r, if_0, 2);
      fsetpos(fp_r, &pos);
      if (if_flag == 1) a1++;
      fgetpos(fp_r, &pos);
      endif_flag = pattern_matching(fp_r, endif, 5);
      fsetpos(fp_r, &pos);
      if (endif_flag == 1) a2++;
      } while (a1 != a2);
    for (i = 0; i < 5; i++) c = fgetc(fp_r);
    if (c == EOF) {
      EOF_flag = 1;  /* assignment; the original "==" was a no-op comparison */
      }
    }
  fclose(fp_r);
  fclose(fp_w);
  }
int main (void) {
  rm_if0_endif("gcc_output.c",
               "final_output.c", "w", "// this is a demo of 'rm_if0_endif.c'\n");
  return 0;
  }
S.S. Anne
Zheyuan Li