I am writing Python code to remove comments from source code. However, I want to keep the header of the source file, such as
//**********************************
//*author
//*Function
//**********************************
and comments like
//example
I only want to remove comments such as // example (where a blank follows the //).
I am using the code from the highest-scored answer to the question "Using regex to remove comments from source files":
import re

def remove_comments(string):
    pattern = r"(\".*?\"|\'.*?\')|(/\*.*?\*/|//[^\r\n]*$)"
    # first group captures quoted strings (double or single)
    # second group captures comments (//single-line or /* multi-line */)
    regex = re.compile(pattern, re.MULTILINE|re.DOTALL)
    def _replacer(match):
        # if the 2nd group (capturing comments) is not None,
        # it means we have captured a non-quoted (real) comment string.
        if match.group(2) is not None:
            return "" # so we will return empty to remove the comment
        else: # otherwise, we will return the 1st group
            return match.group(1) # captured quoted-string
    return regex.sub(_replacer, string)
I changed the pattern slightly to
pattern = r"(\".*?\"|\'.*?\')|(/\*.*?\*/|//(?!(\*|\w))[^\r\n]*$)"
This did not work for the //* header.
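A minimal sketch of what "did not work" can look like. It assumes the file also contains a /* ... */ block comment somewhere after the header; the sample text and the helper name strip_comments are made up for illustration. In that case the /* inside //* seems to be picked up by the /\*.*?\*/ alternative, starting at the second slash, and everything up to the next */ is removed:

```python
import re

# Pattern from the post: // comments starting with * or a word character
# are meant to be kept.
STAR_PATTERN = r"(\".*?\"|\'.*?\')|(/\*.*?\*/|//(?!(\*|\w))[^\r\n]*$)"

def strip_comments(pattern, text):
    """Hypothetical helper using the same replacer logic as remove_comments."""
    regex = re.compile(pattern, re.MULTILINE | re.DOTALL)
    # Keep quoted strings (group 1), drop comments (group 2).
    return regex.sub(
        lambda m: "" if m.group(2) is not None else m.group(1), text
    )

# Made-up sample: a //* header followed by code and a block comment.
WITH_BLOCK = (
    "//*****\n"
    "//*author\n"
    "//*****\n"
    "int x = 1; // example\n"
    "/* block */\n"
)

# Same idea, but without any */ later in the text.
NO_BLOCK = "//*****\n//*author\nint x = 1; // example\n"

eaten = strip_comments(STAR_PATTERN, WITH_BLOCK)
kept = strip_comments(STAR_PATTERN, NO_BLOCK)

print(repr(eaten))  # '/\n': header and code are swallowed up to the */
print(repr(kept))   # header kept, only "// example" removed
```

Without a later */ the header survives, so the result depends on what follows the header in the file.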
But when I change * to #, as in
pattern = r"(\".*?\"|\'.*?\')|(/\*.*?\*/|//(?!(#|\w))[^\r\n]*$)"
//##################################
//#author
//#Function
//##################################
it works: the //# header lines are kept.
I am just confused: what is the difference between # and * here? Thanks for your help.
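One possible explanation is that //* contains /*, which the /\*.*?\*/ alternative can match starting at the second slash, whereas //# contains nothing that looks like the start of a block comment. Below is a sketch of a variant that sidesteps this by matching the protected // lines in the first (kept) group, so they are consumed whole before the block-comment alternative is tried inside them. The pattern, the function name remove_comments_keep_headers and the sample text are assumptions for illustration, not part of the referenced answer:

```python
import re

# Assumed variant: group 1 also matches // comments whose first character
# after // is * or a word character, so they are returned unchanged.
# Group 2 matches the comments that should be removed.
KEEP_PATTERN = (
    r"(\".*?\"|\'.*?\'|//[*\w][^\r\n]*$)"
    r"|(/\*.*?\*/|//[^\r\n]*$)"
)

def remove_comments_keep_headers(text):
    regex = re.compile(KEEP_PATTERN, re.MULTILINE | re.DOTALL)

    def _replacer(match):
        # Group 2 is a real comment to drop; group 1 is kept as-is.
        if match.group(2) is not None:
            return ""
        return match.group(1)

    return regex.sub(_replacer, text)

# Made-up sample covering the cases from the question.
sample = (
    "//*****\n"
    "//*author\n"
    "//example\n"
    "int x = 1; // example\n"
    "/* block */\n"
)

cleaned = remove_comments_keep_headers(sample)
print(cleaned)
```

With this sample, the //* lines and //example are kept, while // example and the /* block */ comment are removed. Note that a real block comment written directly after two slashes would also be treated as a protected line, so this is only suitable if that case does not occur in the files being processed.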