Searching multiple sub strings with special character as marker

Question

I have a string like this:

myStr = "abcd123[ 45][12] cd [67]"

I want to fetch all the substrings between the '[' and ']' markers. I am using findall for this, but all I get is everything between the first '[' and the last ']'.

print(re.findall(r'\[(.+)\]', myStr))

What am I doing wrong here?

Tim Biegeleisen · Accepted Answer · 2019-02-14T10:42:07.327

3

This will probably be marked as duplicate, but the simple fix here would be to just make your dot lazy:

print(re.findall(r'\[(.+?)\]', myStr))

[' 45', '12', '67']

Here .+? means: consume as few characters as possible, stopping at the first (nearest) closing square bracket. Your current pattern, .+, is greedy and consumes everything up to the very last closing square bracket.
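To make the difference concrete, here is a minimal sketch that runs both quantifiers against the string from the question. The variable names are only illustrative.

```python
import re

my_str = "abcd123[ 45][12] cd [67]"

# Greedy: .+ first grabs as much as it can, then backtracks
# only as far as the LAST ']' in the string.
greedy = re.findall(r'\[(.+)\]', my_str)

# Lazy: .+? grows one character at a time and stops at the NEAREST ']'.
lazy = re.findall(r'\[(.+?)\]', my_str)

print(greedy)  # [' 45][12] cd [67']
print(lazy)    # [' 45', '12', '67']
```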

Another pattern which gives the same result here is \[([^\]]+)\], which matches any run of characters other than a closing square bracket:

print(re.findall(r'\[([^\]]+)\]', myStr))

edited Feb 14 '19 at 10:42

answered Feb 14 '19 at 10:39

Tim Biegeleisen


Can you please edit your question with a little explanation of what the `?` changes? – ddor254 Feb 14 '19 at 10:41

Christoph Burschka · Answer 2 · 2019-02-14T10:44:37.277

1

The .+ is greedy and selects as much as it can, including other [ and ] characters.

You have two options: Make the selector non-greedy by using .+? which selects the least number of characters possible, or explicitly exclude [] from your match by using [^\[\]]+ instead of .+.
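The two options agree on the string from the question, but they are not interchangeable on every input. The sketch below compares them on two made-up inputs (a stray opening bracket and a newline inside the brackets); those inputs are assumptions for illustration, not part of the original question.

```python
import re

lazy = r'\[(.+?)\]'
negated = r'\[([^\[\]]+)\]'

same = "abcd123[ 45][12] cd [67]"
stray = "[a[b]"        # hypothetical input with an unmatched '['
multiline = "[a\nb]"   # hypothetical input with a newline inside brackets

# On the question's string both patterns agree.
print(re.findall(lazy, same))     # [' 45', '12', '67']
print(re.findall(negated, same))  # [' 45', '12', '67']

# The lazy dot happily crosses a stray '[', the negated class does not.
print(re.findall(lazy, stray))     # ['a[b']
print(re.findall(negated, stray))  # ['b']

# By default '.' does not match a newline, but a negated class does.
print(re.findall(lazy, multiline))     # []
print(re.findall(negated, multiline))  # ['a\nb']
```

Passing re.DOTALL would let the lazy version match across newlines as well.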

(Both of these options are about equally good in this case. Though the "non-greedy" option is preferable if your ending delimiter is a longer string instead of a single character, since the longer string is more difficult to exclude.)
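As a rough illustration of the longer-delimiter case, the sketch below uses a made-up "<!--" / "-->" delimiter pair (an assumption, not something from the question). Excluding a single character from the closing delimiter breaks as soon as that character appears in the content, while the non-greedy version still works.

```python
import re

# Hypothetical text using multi-character delimiters.
text = "a <!-- one --> b <!-- well-known --> c"

# Non-greedy: stops at the nearest full "-->" sequence.
lazy = re.findall(r'<!--(.+?)-->', text)

# Naive exclusion: forbidding '-' also rejects content that contains a hyphen.
excluded = re.findall(r'<!--([^-]+)-->', text)

print(lazy)      # [' one ', ' well-known ']
print(excluded)  # [' one ']
```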

edited Feb 14 '19 at 10:44

answered Feb 14 '19 at 10:41

Christoph Burschka


1

Actually, your second suggestion is probably "better" in the sense that it should work across almost every regex engine, whereas lazy dot may not work in every engine. – Tim Biegeleisen Feb 14 '19 at 10:44
