
I want to replace each XML tag with a run of a repeated character that is the same length as the tag.

For example:

<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>

I want to replace it with:

#############2013-01-21T21:15:00Z##############

How can we use RegEx for this?

Marko
hmghaly

1 Answer


re.sub accepts a function as replacement:

re.sub(pattern, repl, string, count=0, flags=0)

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

Here's an example:

In [1]: import re

In [2]: def repl(m):
   ...:     return '#' * len(m.group())
   ...: 

In [3]: re.sub(r'<[^<>]*?>', repl,
   ...:     '<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>')
Out[3]: '#############2013-01-21T21:15:00Z##############'

The pattern I used may need some polishing; I'm not sure what the canonical solution to matching XML tags is. But you get the idea.
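For use outside an interactive session, the same idea can be sketched as a small script. This is only a minimal sketch: the helper name `mask_tags` and its `fill` parameter are hypothetical names introduced here for illustration, and the pattern is the same rough one as above, so it inherits the same limitations (it is not a real XML parser).

```python
import re

# Same rough pattern as above: "<", then any characters except "<" or ">", then ">".
TAG_PATTERN = re.compile(r'<[^<>]*?>')


def mask_tags(text, fill='#'):
    """Replace each tag-like match with `fill` repeated to the match's length.

    `mask_tags` and `fill` are hypothetical names used only for this sketch.
    """
    # The lambda receives a match object for each non-overlapping match
    # and returns the replacement string for that match.
    return TAG_PATTERN.sub(lambda m: fill * len(m.group()), text)


original = '<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>'
masked = mask_tags(original)
print(masked)
```

Because every match is replaced by a string of the same length, the result is always as long as the input, which is handy if character offsets into the original text need to stay valid.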

Lev Levitsky
  • the canonical solution to matching XML tags is in [this answer](https://stackoverflow.com/a/1732454/786559) :) – Ciprian Tomoiagă Nov 27 '17 at 14:00
  • quick question what does .group() do in the above? – Hao S Jul 17 '21 at 18:48
  • @HaoS called without arguments, the method returns the whole matched string. Had there been any "groups" in the regex, passing a group number to this method would return them. See the reference here: https://docs.python.org/3/library/re.html#re.Match.group – Lev Levitsky Jul 17 '21 at 19:36
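
To illustrate the point about `.group()`, here is a small sketch. The pattern with two capturing groups is an assumption made up for this example and is not the one used in the answer; it only serves to show the difference between the whole match and the individual groups.

```python
import re

# Hypothetical pattern with two capturing groups:
# group 1 is the optional "/" of a closing tag, group 2 is the tag name.
m = re.search(r'<(/?)([^<>]*?)>', '</o:LastSaved>')

whole = m.group()    # the entire match; same as m.group(0)
slash = m.group(1)   # text captured by the first group
name = m.group(2)    # text captured by the second group

print(whole, slash, name)
print(m.groups())    # all captured groups as a tuple
```

In the answer's `repl` function there are no capturing groups, so `m.group()` is simply the full tag text, and its length is the number of `#` characters to emit.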