-2

I have a string that contains placeholders surrounded by "%", and I want to get a list of those placeholders. I tried this regex:

m = re.search('%s(.*)%s' % ('\%', '\%'), message)

on the following string

black %brown% fox jumped over the %lazy% dog

I expect to get

['brown', 'lazy']

but instead, I get

'brown% fox jumped over the %lazy'
Alexei Masterov
  • 1
    You have a greedy search, though, jumping right to the last `%`. Question: Do your strings hold any `%` other than the placeholders? – JvdV Nov 24 '20 at 18:50
  • 1
    Similar to this? https://stackoverflow.com/questions/766372/python-non-greedy-regexes – xdhmoore Nov 24 '20 at 18:51
  • 2
    Why are you using string formatting at all? `'%(.*)%'` is a valid regular expression. – chepner Nov 24 '20 at 18:56

6 Answers

0

By default, quantifiers are greedy: .* matches as much text as it can, so the match runs all the way to the last % in the string.

You have two solutions:

  • Perform a non-greedy search (as pointed out by JvdV and xdhmoore in the comment section) by adding a ? after the *: (.*?)

  • Edit your regexp to forbid any % inside the placeholder, using [^%] instead of .:

    m = re.search('%([^%]*)%', message)

Note: I removed the percent string formatting. I first thought you wanted to parametrize the placeholder boundary, but I now share chepner's opinion that it is unnecessary, so I wrote a plain regex in its place. Also keep in mind that re.search returns only the first match; use re.findall to get the full list.
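
A minimal sketch comparing the original greedy pattern with the two fixes listed above. It assumes the sample sentence from the question and uses re.findall so that every match is collected; the variable names are only illustrative.

```python
import re

message = 'black %brown% fox jumped over the %lazy% dog'

# Greedy: .* extends to the last % in the string, giving one long match.
greedy = re.findall(r'%(.*)%', message)

# Non-greedy: .*? stops at the nearest closing %.
non_greedy = re.findall(r'%(.*?)%', message)

# Negated class: the placeholder text itself may not contain %.
negated = re.findall(r'%([^%]*)%', message)

print(greedy)      # ['brown% fox jumped over the %lazy']
print(non_greedy)  # ['brown', 'lazy']
print(negated)     # ['brown', 'lazy']
```

On this input the two fixes give the same result. They can differ when a placeholder spans several lines, because . does not match a newline by default while [^%] does.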

Amessihel
0

This regular expression finds an item that is between two % signs.

'%(\w+)%'

Then use it with re.findall:

import re

string = 'black %brown% fox jumped over the %lazy% dog'
m = re.findall(r'%(\w+)%', string)
print(m)
mosc9575
  • thanks! what's the difference between (.*?) and (\w+) ? why is one better than the other? – Alexei Masterov Nov 24 '20 at 19:11
  • 1
    I can't tell you which is better, but `\w+` matches only word characters (letters, digits, and underscore), while `(.*?)` matches any character except a newline, including non-letters. Your example had only letters between the `%` signs, so I thought it would be enough. – mosc9575 Nov 24 '20 at 19:16
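
To make the difference discussed in these comments concrete, here is a small sketch. The sample text is hypothetical (it is not from the question) and was chosen so that one placeholder contains a space.

```python
import re

# Hypothetical input: the first placeholder contains a space.
text = 'hello %first name%, your id is %user_id%'

# \w+ accepts only letters, digits and underscore, so "first name" is skipped.
word_only = re.findall(r'%(\w+)%', text)

# .*? accepts any characters (except a newline) up to the next %.
any_chars = re.findall(r'%(.*?)%', text)

print(word_only)  # ['user_id']
print(any_chars)  # ['first name', 'user_id']
```

Which one fits depends on what a valid placeholder name looks like: \w+ is stricter and will not treat arbitrary text between two % signs as a placeholder.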
0

You can add ? after the quantifier to get a non-greedy search:

re.findall('%s(.*?)%s' % (r'\%', r'\%'), message)
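
If the delimiter really does need to be configurable, re.escape is one way to build the pattern without hand-written backslashes. This is only a sketch: find_placeholders and its delimiter parameter are hypothetical names introduced for illustration.

```python
import re

def find_placeholders(message, delimiter='%'):
    """Return the text found between pairs of `delimiter` (hypothetical helper)."""
    # re.escape makes the delimiter safe to embed even if it is a regex metacharacter.
    d = re.escape(delimiter)
    return re.findall(f'{d}(.*?){d}', message)

print(find_placeholders('black %brown% fox jumped over the %lazy% dog'))
print(find_placeholders('total: $price$ plus $tax$', delimiter='$'))
```

For a fixed % delimiter the plain pattern '%(.*?)%' is enough, since % has no special meaning in a regular expression.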
Tom Ron
0
import re
text = "black %brown% fox jumped over the %lazy% dog"
print(re.findall(r'%(.*?)%', text))
mhhabib
  • thank you @toRex. Even though the other answers are correct also, I decided to accept yours, because it is the shortest, thus the most elegant. – Alexei Masterov Nov 24 '20 at 19:05
0
import re
message = 'black %brown% fox jumped over the %lazy% dog'
m = re.findall(r'%(.*?)%', message)
print(m)

Output:

['brown', 'lazy']
Abhishek Rai
0

[Solved]: Using Regex

import re

# Store your string
my_str = 'black %brown% fox jumped over the %lazy% dog'

# Find matches
match = re.findall('%([^%]*)%', my_str)

# Print the whole list
print(match)

# Iterate over the matches
for item in match:
    print(item)

[Result]:

['brown', 'lazy']

AziMez