I am using Python 3.7.9 and have an HTML-like string that contains data from a pandas table. I'd like to color specific values from that table, so I want to take the text between a pair of string markers and wrap it in different markers (the ones Confluence uses to render text in a specific color).

My input text string is:

text = 'some text now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker decrease-12355decrease with important information'

The replacement strings are:

increase = '<span style=\"color: Red;\">'+val+'</span>'
decrease = '<span style=\"color: Green;\">'+val+'</span>'

and val is the information to be found between the markers.

So my expected output is:

output = some text now important information starts <span style="color: Green;">-123456</span> more text not to touch next marker <span style="color: Red;">7896278689</span> and more text another marker <span style="color: Green;">-12355</span> with important information

Here is what I tried:

import re

text = 'some text now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker decrease-12355decrease with important information'
found_increase = re.findall('increase(.+?)increase', text)
found_decrease = re.findall('decrease(.+?)decrease',text)
output=''
for i, val in enumerate(found_increase):
    output=text.replace('increase'+val+'increase', '<span style=\"color: Red;\">'+val+'</span>')
for i, val in enumerate(found_decrease):
    output=text.replace('decrease'+val+'decrease', '<span style=\"color: Green;\">'+val+'</span>')
print(output)

I have also tried the styling functionality that comes with pandas, but Confluence markup is not real HTML, so that approach does not work for me. With the code above I get the following output, in which only the last marker has been replaced:

some text now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker <span style="color: Green;">-12355</span> with important information

Lars
  • Hi, maybe you can have a look at re.sub in the documentation: https://docs.python.org/3/library/re.html. – CyDevos Feb 13 '21 at 23:18

2 Answers

I found that this code worked instead:

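print(re.sub(r"decrease(.*?)decrease", r'<span style="color: Green;">\1</span>', text))

What's happening here is that we are replacing the pattern

r"decrease(.*?)decrease"

with

r'<span style="color: Green;">\1</span>'

where \1 is the content captured by (.*?). Notice the leading r before each string: it makes the string a raw string, so Python passes the backslash in \1 through to re.sub unchanged instead of treating it as an escape sequence. The replacement is written in single quotes so that the double quotes inside it need no backslashes; in a raw string, \" would leave a literal backslash in the output.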

Obviously, you need to recreate this for the increase version as well.
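
To cover both markers without copying the call by hand, one option is to loop over a small mapping and feed each result into the next substitution. This is only a sketch: the name MARKER_COLORS and the loop structure are my own assumptions, not something from the question.

```python
import re

text = ('some text now important information starts decrease-123456decrease '
        'more text not to touch next marker increase7896278689increase '
        'and more text another marker decrease-12355decrease with important information')

# Hypothetical mapping from marker word to the color it should produce.
MARKER_COLORS = {'increase': 'Red', 'decrease': 'Green'}

output = text
for marker, color in MARKER_COLORS.items():
    # Non-greedy group: capture only up to the next occurrence of the marker.
    pattern = re.escape(marker) + r'(.+?)' + re.escape(marker)
    replacement = '<span style="color: {};">'.format(color) + r'\1</span>'
    # Substitute on the previous result so both replacements accumulate.
    output = re.sub(pattern, replacement, output)

print(output)
```

The important detail is that every re.sub call works on output, not on the original text.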

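Note that str.replace() already replaces all occurrences. The actual problem in your loops is that every iteration calls text.replace(...) on the original text and overwrites output, so only the very last replacement survives.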
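
If you would rather keep the findall/replace approach from the question, a minimal corrected sketch could look like this. The shortened sample text is made up for illustration; the only real change is that each replace is applied to output instead of text.

```python
import re

# Made-up, shortened sample in the same marker format as the question.
text = 'before increase42increase middle decrease-7decrease after'

output = text  # start from the input and keep updating the same variable
for val in re.findall('increase(.+?)increase', text):
    output = output.replace('increase' + val + 'increase',
                            '<span style="color: Red;">' + val + '</span>')
for val in re.findall('decrease(.+?)decrease', text):
    output = output.replace('decrease' + val + 'decrease',
                            '<span style="color: Green;">' + val + '</span>')

print(output)
```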

Kraigolas

The Python regex engine directly supports replacement with capture groups through re.sub and re.Pattern.sub. By default, every occurrence of the pattern is replaced.

https://docs.python.org/3/library/re.html#re.sub

In the replacement string, the first capture group is referenced as r'\1' (raw string) or, equivalently, '\\1' (regular string).
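
A quick way to convince yourself that the two spellings are the same is the following sketch. The sample string and the square brackets in the replacement are made up for illustration; \g<1> is a third, more explicit spelling that re.sub also accepts.

```python
import re

sample = 'x decrease-5decrease y'  # made-up sample string
pattern = 'decrease(.+?)decrease'

raw = re.sub(pattern, r'[\1]', sample)        # raw string, single backslash
escaped = re.sub(pattern, '[\\1]', sample)    # regular string, escaped backslash
explicit = re.sub(pattern, r'[\g<1>]', sample)  # explicit group reference

print(raw)
print(raw == escaped == explicit)
```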

import re
text = 'some text now important information starts decrease-123456decrease more text not to touch next marker increase7896278689increase and more text another marker decrease-12355decrease with important information'
inc_replaced = re.sub('increase(.+?)increase', '<span style=\"color: Red;\">\\1</span>', text)
# Run the second substitution on the result of the first one, not on the original text
output = re.sub('decrease(.+?)decrease', '<span style=\"color: Green;\">\\1</span>', inc_replaced)

>>> output
'some text now important information starts <span style="color: Green;">-123456</span> more text not to touch next marker <span style="color: Red;">7896278689</span> and more text another marker <span style="color: Green;">-12355</span> with important information'
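
As a possible variation, both markers can be handled in a single pass by passing a function as the replacement. This is a sketch under my own assumptions: the names COLORS, PATTERN and colorize are made up, and the backreference \1 in the pattern requires the closing marker to be the same word as the opening one.

```python
import re

text = ('some text now important information starts decrease-123456decrease '
        'more text not to touch next marker increase7896278689increase '
        'and more text another marker decrease-12355decrease with important information')

# Hypothetical mapping from marker word to color.
COLORS = {'increase': 'Red', 'decrease': 'Green'}

# Group 1 is the marker, group 2 the value; \1 matches the same marker again.
PATTERN = re.compile(r'(increase|decrease)(.+?)\1')

def colorize(match):
    """Build the colored span for one marker/value pair."""
    color = COLORS[match.group(1)]
    return '<span style="color: {};">{}</span>'.format(color, match.group(2))

output = PATTERN.sub(colorize, text)
print(output)
```

When the replacement is a function, its return value is inserted as-is, so no backslash escaping is needed in the span string.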

sholderbach