
The simplest way to explain is with this code:

Str = 'Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]'
from re import findall

findall(r'[^\]\]]+\]\]?', Str)

What I get is:

['Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5]',
 ', [-3, 5.5, 0, 9.5]]',
 'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5]',
 ', [-3, 9.5, 0, 13.5]]',
 'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5]',
 ', [-3, 9.5, 0, 13.5]]']

I assume it is splitting on a single ']' instead of ']]'. I want the result below:

['Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]',
 'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]',
 'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]']

I have gone through the documentation but couldn't work out how to achieve this, or how the findall call above should be modified. A similar technique was used in one of the answers to "In Python, how do I split a string and keep the separators?"

2 Answers


Since you're trying to match balanced bracket constructs, a more robust solution would be to use a regex engine that supports recursion, such as the regex module, and use the (?R) pattern to recursively match balanced pairs of brackets:

import regex

regex.findall(r'.*?\[(?>[^[\]]|(?R))*\]', Str)

Demo: https://replit.com/@blhsing/TroubledTurboLint

blhsing

One idea is to use a lazy dot, .+?, followed by ]]. The dot matches any character (except a newline, by default). A non-greedy (lazy) quantifier matches as few characters as possible while still allowing the rest of the pattern to succeed, so each match ends at the first ]] it reaches.

.+?\]\]

See this demo at regex101 (more explanation on the right side)
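
As a minimal sketch of how the lazy pattern could be applied with the standard re module to the string from the question (the names parts and part are only illustrative):

```python
import re

# Input string copied from the question, split across lines for readability.
Str = ('Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]'
       'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]'
       'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]')

# .+? grows one character at a time and stops at the first ']]' it reaches,
# so the single ']' that closes each inner list does not end the match.
parts = re.findall(r'.+?\]\]', Str)

for part in parts:
    print(part)
```

This relies on ]] appearing only at the end of each entry, which holds for the sample string; deeper nesting would need a different approach.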

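Be aware that [^\]\]]+ does not mean "any substring other than ]]". A character class matches a single character at a time, so this matches one or more characters that are not ]. Listing ] twice changes nothing: the pattern behaves the same if you remove one of the closing brackets. Read more about character classes here.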
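
A small check of the character-class point, using a shortened, hypothetical input (sample) rather than the full string from the question:

```python
import re

# Hypothetical shortened input with the same bracket layout as the question.
sample = 'a: [[1], [2]]b: [[3], [4]]'

# Original pattern: ']' is listed twice inside the negated character class.
with_duplicate = re.findall(r'[^\]\]]+\]\]?', sample)

# Same pattern with ']' listed only once inside the class.
without_duplicate = re.findall(r'[^\]]+\]\]?', sample)

# Both patterns stop at the first single ']', so the results are identical.
print(with_duplicate == without_duplicate)
print(with_duplicate)
```

Both calls split at the first single ], which is the behaviour seen in the question.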

bobble bubble