
I'm new to regex.

I'm trying to remove commented-out code from a project, such as:

/*
    // random unmanaged annotation
    foo = var;
    doSomething();
    multilineFunction(a,
                      b);
*/

while leaving comments that are not code, such as:

/*
     real annotation
*/

I want a find-and-replace regular expression that matches "a block between /* and */ that contains a line ending with ;", but my regex doesn't work. How can I write such a regex?

I tried matching the inside of /* */ with (/\*)(.*\n)*?(.*\*/), and a block containing a line that ends with ; with (/\*)(.*\n)*?(.*;\n)(.*\n)(.*\*/), but the second regex matches through to the last */ and is probably messy.

Edit: I only wanted to do this with the find-and-replace function of my IDE. I have since solved it by writing Python code, but I'm still curious.

kdw9502
  • How do you differentiate between "code" and "not code"? – Sweeper Dec 17 '19 at 05:26
  • @Sweeper I assumed it was not a code comment if there was no semicolon inside the comment. – kdw9502 Dec 17 '19 at 05:28
  • Regex is fundamentally the wrong tool for this. The following is about HTML but the fundamental reasoning is the same for any context-free language: https://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not – tripleee Dec 17 '19 at 05:28
  • @tripleee That's good, but it can't filter for comments that contain a semicolon. – kdw9502 Dec 17 '19 at 05:38
  • I would go with something like https://github.com/eliben/pycparser instead. – tripleee Dec 17 '19 at 05:39

2 Answers


As already explained in the comments, it is better to use a good parser.

For a one-time hack, you can use the following regex:

\/\*[^;]*;.*?\*\/


Assumptions for it to work without issues:
- the code is proper C and it contains semicolons (;)
- the annotations do NOT contain semicolons (;)

If the assumptions are not fulfilled, you need to do some additional hacks, or to use a proper parser, as stated initially.

virolino

You could use \/\*\s+(\/\/.+)[\s\S]+\*\/

Explanation:

\/\* - match /* literally

\s+ - match one or more whitespace characters (including newlines)

(\/\/.+) - match a line beginning with //, i.e. a comment (but only one line), and store it in the first capturing group

[\s\S]+ - match one or more of any character (\s is whitespace and \S is non-whitespace)

\*\/ - match */ literally


Then replace it with the first capturing group.

Note that it will only work with a one-line comment placed at the beginning of a commented block.
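As a rough illustration of this replacement in Python, the sketch below applies the pattern with re.sub and substitutes the first capturing group. One deliberate change from the regex above, and an assumption of this sketch: [\s\S]+ is made lazy ([\s\S]+?) so that a match stops at the first */ instead of the last one in the file. The function name keep_leading_line_comment is hypothetical.

```python
import re

# Same structure as the regex above, but with a lazy [\s\S]+? so that each
# match ends at the nearest */ rather than the last one in the input.
BLOCK_WITH_LINE_COMMENT = re.compile(r"/\*\s+(//.+)[\s\S]+?\*/")


def keep_leading_line_comment(source):
    """Replace each matching block with the // comment on its first line."""
    return BLOCK_WITH_LINE_COMMENT.sub(r"\1", source)


sample = (
    "/*\n"
    "    // random unmanaged annotation\n"
    "    foo = var;\n"
    "*/\n"
    "int x = 1;\n"
    "/*\n"
    "     real annotation\n"
    "*/\n"
)
print(keep_leading_line_comment(sample))
```

The second block is left alone because its first non-whitespace text is not a // comment, which is the limitation noted above.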

Michał Turczyn