My goal: ignore all comments when grepping a Java file.
Say I have a Java file "test.java":
/*
* multiple
* line
* comment
* range
*/
line 1;
line 2; // cmt line 2 日本語 abcd
line 3; // cmt line 3
// cmt line 4
My output file should look like this:
line 1;
line 2;
line 3;
I need a multi-line regex, so I can't use plain grep.
So far I have tried two methods:
- pcre2grep -v
REGEX_IS_COMMENT='(logger\\..*$)|([/][/].*$)|((\\/\\*)(.|[\r\n])+?(\\*\\/))'
pcre2grep -MnvH "$REGEX_IS_COMMENT" "$input" > "$output"
=> Bug: -v drops every line that contains a match, so "line 2" and "line 3" do not appear in the output file at all.
- awk, deleting the matched patterns
REGEX_IS_COMMENT='(logger\\..*$)|([/][/].*$)|((\\/\\*)(.|[\r\n])+?(\\*\\/))'
awk 'BEGIN{RS=SUBSEP;} {print gensub(REGEX_IS_COMMENT,"", "g", $0)}' REGEX_IS_COMMENT="$REGEX_IS_COMMENT" "$input" > "$output"
=> Bug: the dot (.) does not match Japanese characters. My output file was:
line 1;
line 2; 日本語 abcd
line 3;
Please share any solutions you have. Thank you!