Regular expression to replace a substring within a String in python

Question

I have to replace a substring with another string in a file.

Below is the line which is present in the file.

Input: #pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */

Expected Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]]

Below is my code:

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
comment = re.search(r'([\/\*]+).*([^\*\/]+)', line)
replace = re.search(r'([\[[]).*([\]]])', comment.group())
replaceWith = replace.group()
content_new = re.sub(r'([\/\*]).*([\*\/])', '# ' + replaceWith, line)

Is there an optimal solution for the above code?

What is `content`? What is the problem with the regex? If you need to replace, why do you use `re.search`, and even twice? — Wiktor Stribiżew, Jul 27 '20 at 17:23
ive edited `content` with `line `. Is there any alternative way to approach this problem with one regex statement? — Geeky, Jul 27 '20 at 17:31
I am unable to do the search for replaceWith in a single regex statement. — Geeky, Jul 27 '20 at 17:39
`content_new = re.sub(r'/\*(.*?)\*/', lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group(1)), line)`, see [demo](https://ideone.com/CzJdbt). — Wiktor Stribiżew, Jul 27 '20 at 17:39
Two more ways: https://regex101.com/r/RbPmfB/1/ and https://regex101.com/r/RbPmfB/2 — Wiktor Stribiżew, Jul 27 '20 at 17:48

score 2 · Accepted Answer · answered Jul 28 '20 at 19:39

You need to match comments, say, with Regex to match a C-style multiline comment, and then replace the [[...]] substring inside the matches. This approach is safest, it won't fail if there is [[ and no ]] inside the comment, and there are several such comments in the string.

The sample code snippet will look like

import re
line = r'#pragma MESSAGE "Hello World" 0000=1 /* Clarification 0001: [[Specific Clarification]] */'
content_new = re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()), line)
print(content_new)

Output: #pragma MESSAGE "Hello World" 0000=1 # [[Specific Clarification]].

Details:

re.sub(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/', ..., line) - finds all C style comments in the string, and
lambda x: re.sub(r'.*(\[\[.*?]]).*', r'# \1', x.group()) is the replacement: x is the match data object with the comment text, it matches any text up to [[, then captures [[ and then any text up to and including ]], and then matches the rest of the comment, and replaces with #, space and the Group 1 value. See the regex demo here.

Regular expression to replace a substring within a String in python

1 Answers1