
I have a large number of files that end with expanded $Log keyword text, which needs to be deleted. I am looking to modify an existing Python 2.7 script to do this, but I cannot get the regex working correctly.

The text to strip from the end of a file looks like this:

/*
one or more lines of ..
.. possible text
$Log: oldfile.c,v $
Revision 11.4  2000/01/20 19:01:41  userid
a bunch more text ..
.. of unknown number of lines
*/

I want to strip all of the text shown above, including the comment anchors /* and */ and everything in between.

I looked at these questions/answers and a few others:

Python re.sub non-greedy mode ..

Python non-greedy regexes

The closest I have been able to get is with:

content = re.sub(re.compile(r'\$Log:.*', re.DOTALL), '', content)

Which of course leaves behind the opening /*.

The following deleted my whole sample test file because the file opens with a matching comment (I thought the non-greedy ? modifier would prevent this):

content = re.sub(re.compile(r'^/\*.*?\$Log:.*', re.DOTALL), '', content)

I experimented with using re.MULTILINE without success.
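A note on why the second attempt above removes everything: a lazy quantifier limits how far a match extends, not where it starts. The engine still reports the leftmost position at which the whole pattern can succeed, and here that is the first `/*` in the file. The following is a minimal sketch of that behaviour; the sample text is hypothetical and only stands in for a real source file.

```python
import re

# Hypothetical sample: an unrelated leading comment, some code,
# then the $Log comment at the end of the file.
content = (
    "/* file header */\n"
    "int main(void) { return 0; }\n"
    "/*\n"
    "$Log: oldfile.c,v $\n"
    "Revision 11.4  2000/01/20 19:01:41  userid\n"
    "*/\n"
)

pattern = re.compile(r'^/\*.*?\$Log:.*', re.DOTALL)
match = pattern.search(content)

# The match starts at the first "/*" in the file; ".*?" then grows
# until "$Log:" is found, crossing the end of the first comment.
print(match.start())

stripped = pattern.sub('', content)
print(repr(stripped))
```

Because the match starts at offset 0 and the trailing `.*` runs to the end of the string under `re.DOTALL`, nothing is left after the substitution.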

How can a regex be defined in Python to grab the whole $Log comment -- AND none of the previous comments in the file?

Mutagon
  • I added some clarifications to my question: made the comment-to-remove more general, noted that the comment anchors `/*` and `*/` and everything in between needs to be removed, and that earlier comments in the file need to remain untouched. – Mutagon May 02 '17 at 23:56

3 Answers


You can use:

result = re.sub(r"/\*\s+\*+\s+\$Log.*?\*/", "", subject, 0, re.DOTALL)


Regex Demo

Python Demo

Pedro Lobito
  • My original goal was to remove from `/*` to `*/` and everything in between. However, it might be okay to leave behind the comment anchors and any harmless text that occurs before the $Log line (like the string of asterisks shown above). – Mutagon May 02 '17 at 23:38
  • I can't open the Python Demo link. Blocked by my firewall :( But your example is close -- I get what you show above, whereas I would like the text you show in Output to be deleted too -- without other comments in the file being deleted also. I will play around with the Regex Demo. Thanks for providing that. – Mutagon May 02 '17 at 23:51
  • This fit the previous iteration of my evolving example. I modified it slightly to cover the possibility of there being multiple lines between the opening comment anchor and the $Log line: `/\*\s+(\*+\s+)*?\$Log.*?\*/`. This answer is what got me there. Thanks! – Mutagon May 03 '17 at 00:21
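The modified pattern in the last comment only allows lines of asterisks between `/*` and `$Log`. For the more general comment shown in the question, where arbitrary text may precede the `$Log` line, one possible approach is a pattern that refuses to cross a closing `*/` before reaching `$Log:`. This is a sketch and not taken from the answer: the names `LOG_COMMENT` and `strip_log_comment` and the sample text are hypothetical, and it assumes that no earlier comment in the file mentions `$Log:`.

```python
import re

# Hypothetical names; the sample text below is illustrative only.
# (?:(?!\*/).)*? consumes one character at a time, lazily, but never
# steps onto a closing "*/". An attempt that starts at an earlier
# comment's "/*" therefore fails instead of running on into the
# $Log comment.
LOG_COMMENT = re.compile(r'/\*(?:(?!\*/).)*?\$Log:.*?\*/', re.DOTALL)


def strip_log_comment(text):
    """Remove every /* ... */ comment that contains a $Log: keyword."""
    return LOG_COMMENT.sub('', text)


sample = (
    "/* file header */\n"
    "int main(void) { return 0; }\n"
    "/*\n"
    "one or more lines of ..\n"
    "$Log: oldfile.c,v $\n"
    "Revision 11.4  2000/01/20 19:01:41  userid\n"
    "a bunch more text ..\n"
    "*/\n"
)

result = strip_log_comment(sample)
print(result)
```

The leading comment and the code survive, while the whole `$Log` comment, including both anchors, is removed. The newline that followed the closing `*/` is left in place; appending `\s*` to the pattern would remove trailing whitespace as well, if that is wanted.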

It is a bit unclear what output you are expecting. My understanding is that you are trying to extract the comment. I'm assuming that the comment appears on the third line and that you just need to extract that line with a regex. The regular expression used:

(\$Log:.*[\r\n]*.*[\r\n])(.*)

After matching with this regex, the second capture group holds the comment, as demonstrated in the demo linked below. So blah blah blah can be fetched using .group(2). Python code below:

import re

# Group 1 spans the $Log line and the line after it; group 2 is the next line.
matches = re.search(r"(\$Log:.*[\r\n]*.*[\r\n])(.*)", content)
print(matches.group(2))
# Output: blah blah blah

Regex101: Sample code for python is available here.

Python Demo


degant
  • My goal is actually to remove the entire comment from `/*` to `*/`. But like @PedroLobito's answer, this presents possibilities to remove most of the content of the comment, which could be okay if the comment anchors can't easily be removed too. – Mutagon May 02 '17 at 23:42
content = re.sub(re.compile(r'\/\*\n\**\n\$Log(?:.|[\n])*\*\/', re.DOTALL), '', content)

Regex Explanation

  • Welcome to StackOverflow and thanks for your attempt to help. Please take the [tour] to get many useful hints on how to ask and answer. Your answer has an external link as only explanation. When linking externally, a short summary is expected in the answer. The main trick, i.e. the part which is actually the solution, is worth a little time for typing it and more personally helpful than the generated description. Also linking to the (useful) source of that picture would be more helpful. – Yunnosch May 02 '17 at 23:47
  • This almost works but applies only to the poor example that I started with. I updated my example comment-to-remove to be more general. – Mutagon May 02 '17 at 23:58