I want to match two regular expressions A and B where A and B appear as 'AB'. I want to then insert a space between A and B so that it becomes 'A B'.

For example, if A = [0-9] and B = !+, I want to do something like the following.

match = re.sub('[0-9]!+', '[0-9] !+', input_string)

But this obviously does not work, because it replaces every match with the literal string '[0-9] !+'.

How do I do this in regular expressions (preferably in one line)? Or does this require several tedious steps?

Rafał Rawicki

2 Answers

Use the groups!

match = re.sub(r'([0-9])(!+)', r'\1 \2', input_string)

\1 and \2 refer to the text matched by the first and second parenthesised groups. The r prefix makes the replacement a raw string, so Python keeps the \ character intact instead of treating \1 as an escape sequence.
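A minimal runnable sketch of the group approach, for anyone who wants to try it. The function name and the sample string are made up for illustration; \g<1> is just the longer spelling of \1.

```python
import re

def space_before_bangs(text):
    # Group 1 captures a single digit, group 2 captures one or more '!'.
    # The replacement template puts a space between the two captured pieces.
    return re.sub(r'([0-9])(!+)', r'\1 \2', text)

def space_before_bangs_g(text):
    # Same substitution using the \g<N> form of the backreferences.
    return re.sub(r'([0-9])(!+)', r'\g<1> \g<2>', text)

sample = "wow5!!! and 7!"
print(space_before_bangs(sample))    # wow5 !!! and 7 !
print(space_before_bangs_g(sample))  # wow5 !!! and 7 !
```

The \g<1> form is handy when the group reference is followed directly by a digit in the replacement, where \1 would be ambiguous.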

Kos

Suppose the input string is "I have 5G network" and you want a space between the 5 and the G ("I have 5 G network"). More generally, whenever there are tokens like G20 or AK47, you want to separate the digits from the letters. You might be tempted to replace one regular expression with another, something like this:

re.sub(r'\w\d',r'\w \d',input_string)

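But this won't work: the replacement string is a template, not a pattern, so it cannot reproduce the text matched by the first regular expression. In Python 3.7 and later, an unknown escape such as \w in the replacement actually raises an error.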

Solution:

It can be easily solved by referring to the captured groups in the replacement. This method works well when you want to add something around the matched text.

re.sub(r"(\..*$)", r"_BACK\1", "my_file.jpg")

and

re.sub(r'(\d+)', r'<num>\1</num>', "I have 25 cents")
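Here is a small sketch showing what those two single-group substitutions return. The variable names are only for illustration.

```python
import re

# Group 1 captures everything from the first dot to the end of the string,
# so "_BACK" is inserted just before the extension.
renamed = re.sub(r"(\..*$)", r"_BACK\1", "my_file.jpg")

# Group 1 captures each run of digits, which is then wrapped in <num> tags.
tagged = re.sub(r"(\d+)", r"<num>\1</num>", "I have 25 cents")

print(renamed)  # my_file_BACK.jpg
print(tagged)   # I have <num>25</num> cents
```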

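You can use this method to solve your question as well by capturing two groups instead of one. The pattern below covers an uppercase letter followed by a digit (G20, AK47); for a digit followed by a letter, as in 5G, swap the two groups.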

re.sub(r"([A-Z])(\d)",r"\1 \2",input_string)

Another way to do it is to pass a function (here a lambda) as the replacement:

re.sub(r"([A-Z])(\d)", lambda m: m.group(1) + ' ' + m.group(2), input_string)
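A sketch of the callable form with a named function instead of a lambda. It assumes the first character should be an uppercase letter (with \w, a pair of digits such as "25" would also be split). The helper names are hypothetical.

```python
import re

def split_letter_digit(match):
    # re.sub calls this once per match and uses the returned string
    # as the replacement. Group 1 is the letter, group 2 is the digit.
    return match.group(1) + ' ' + match.group(2)

def add_spaces(text):
    # Uppercase letter immediately followed by a digit.
    return re.sub(r"([A-Z])(\d)", split_letter_digit, text)

print(add_spaces("G20 and AK47"))  # G 20 and AK 47
```

A function is mostly worth it when the replacement needs logic that a template string cannot express; for a plain space, the \1 \2 template is shorter.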

And another way of doing it is by using lookarounds (a lookbehind plus a lookahead), which match the empty position between the two characters:

re.sub(r"(?<=[A-Z])(?=\d)",r" ",input_string)
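A sketch of the lookaround version. The second pattern, which also handles a digit followed by a letter (the 5G case), is my own extension and not part of the line above; the function names are hypothetical.

```python
import re

def space_letter_then_digit(text):
    # Zero-width match: the position preceded by an uppercase letter and
    # followed by a digit. Nothing is consumed, so only a space is inserted.
    return re.sub(r"(?<=[A-Z])(?=\d)", " ", text)

def space_between_letters_and_digits(text):
    # Assumed extension: letter-then-digit OR digit-then-letter boundaries.
    pattern = r"(?<=[A-Za-z])(?=\d)|(?<=\d)(?=[A-Za-z])"
    return re.sub(pattern, " ", text)

print(space_letter_then_digit("G20 and AK47"))                # G 20 and AK 47
print(space_between_letters_and_digits("I have 5G network"))  # I have 5 G network
```

Because the lookarounds consume no characters, no groups or backreferences are needed in the replacement.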

Ritwik