
My goal: ignore all comments when grepping a Java file.

Say I have a Java file "test.java":

/*
 * multiple
 * line
 * comment
 * range
 */

line 1;
line 2;             // cmt line 2 日本語 abcd
line 3;             // cmt line 3
// cmt line 4

My output file should be like this:

line 1;
line 2;             
line 3; 

I need a multi-line regex to catch block comments, so I can't use plain line-by-line grep.

Actually, I have tried 2 methods:

  1. pcre2grep -v
    REGEX_IS_COMMENT='(logger\\..*$)|([/][/].*$)|((\\/\\*)(.|[\r\n])+?(\\*\\/))'
    pcre2grep -MnvH "$REGEX_IS_COMMENT" "$input" > "$output"

=> Bug: -v drops every whole line that contains a match, instead of removing only the matched text, so "line 2" and "line 3" do not appear in the output file at all.
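
To make this concrete, here is a minimal reproduction of the line-level behaviour. It uses plain grep and sed rather than pcre2grep (an assumption, so that it runs without extra tools), and the `sample` variable is made up for illustration:

```shell
# Two lines of made-up input: one without and one with a trailing comment.
sample='line 1;
line 2;   // cmt line 2'

# grep -v discards every line that contains a match, code included.
dropped=$(printf '%s\n' "$sample" | grep -v '//.*$')

# sed deletes only the matched text and keeps the rest of the line.
trimmed=$(printf '%s\n' "$sample" | sed 's|//.*$||')

printf '%s\n---\n%s\n' "$dropped" "$trimmed"
```

With grep -v only "line 1;" survives, while the substitution keeps "line 2;" and removes just its comment, which is the behaviour I am after.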

  2. awk, deleting the matched patterns
    REGEX_IS_COMMENT='(logger\\..*$)|([/][/].*$)|((\\/\\*)(.|[\r\n])+?(\\*\\/))'
    awk 'BEGIN{RS=SUBSEP;} {print gensub(REGEX_IS_COMMENT,"", "g", $0)}' REGEX_IS_COMMENT="$REGEX_IS_COMMENT" "$input" > "$output"

=> Bug: the dot (.) does not match the Japanese characters, so part of the comment is left behind. My output file was:

     line 1;
     line 2;                日本語 abcd
     line 3;        
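
One direction I am considering, as a rough sketch only and not a finished solution: avoid the regex dot entirely and scan for the ASCII markers `//`, `/*` and `*/` with awk's index() and substr(). The function name `strip_comments` is hypothetical. It assumes the markers never appear inside string literals (for example "http://..."), it drops lines that end up empty or blank, and it does not handle the `logger` part of my regex.

```shell
# Hypothetical helper: strip // and /* */ comments from Java-like source.
# Assumption: comment markers never occur inside string literals.
strip_comments() {
  # LC_ALL=C makes awk work on bytes, so multibyte text such as 日本語
  # is copied or dropped as-is and never has to match a regex dot.
  LC_ALL=C awk '
  {
    line = $0
    out = ""
    while (length(line) > 0) {
      if (in_block) {
        # Inside a block comment: skip up to the closing marker, if any.
        stop = index(line, "*/")
        if (stop == 0) { line = "" }
        else { line = substr(line, stop + 2); in_block = 0 }
      } else {
        blk = index(line, "/*")
        eol = index(line, "//")
        if (eol > 0 && (blk == 0 || eol < blk)) {
          # Line comment comes first: keep the code before it.
          out = out substr(line, 1, eol - 1)
          line = ""
        } else if (blk > 0) {
          # Block comment opens here: keep the code before it.
          out = out substr(line, 1, blk - 1)
          line = substr(line, blk + 2)
          in_block = 1
        } else {
          out = out line
          line = ""
        }
      }
    }
    # Print only lines that still contain something besides blanks.
    if (out ~ /[^ \t]/) print out
  }' "$@"
}
```

It would be used as `strip_comments test.java > "$output"`, and a normal grep could then run on the result. I am not sure this is robust enough, hence the question.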

Please share your solutions. Thank you!
