A follow-up to the question "Python Regex - replace a string not located between two specific words", as the answers there were incomplete.

Given a string str, split it on "::", ignoring any "::" that lies between "<" and ">" brackets.

Expected inputs and outputs:

input  a :: <<a :: b> c>::<a < a < b:: b> :: b> :: b> ::      a
output [a , <<a :: b> c>,<a < a < b:: b> :: b> :: b> ,      a]

input a< b <c a>>
output [a< b <c a>>]

input a:<a b>
output [a:<a b>]
ErezO
  • There's an answer in the link, what's the problem? – Maroun Apr 19 '15 at 08:03
  • `[i for i in regex.split(r'(<(?:(?R)|[^<>])*>)|\s?::\s?', s) if i]` will work. – Avinash Raj Apr 19 '15 at 08:05
  • The answer is incomplete, as a< b > (which contains no "::") is being split into ['a', '< b >']. I'll remove the check mark from the accepted answer. – ErezO Apr 19 '15 at 08:05
  • @ErezO which input string you used? – Avinash Raj Apr 19 '15 at 08:07
  • s = "a< b >", res = [i for i in regex.split(r'(<(?:(?R)|[^<>])*>)|\s?::\s?', s) if i], res = ['a', '< b >']. Basically, the 2nd input isn't working. – ErezO Apr 19 '15 at 08:09
  • So, what's the expected output for the above? – Avinash Raj Apr 19 '15 at 08:10
  • No "::", so no split. In both the 2nd and 3rd example, there should be no split as there's no "::". Regex-2015.3.18 if it makes a difference – ErezO Apr 19 '15 at 08:11
  • @ErezO in the previous question, you made a number of edits, so we came up with a solution based on the expected output you provided. It works for that specific input. Now you are asking not to split when there is no `::` in the input, so this new question makes sense. But please accept a solution there that closely solves your previous problem. – Avinash Raj Apr 19 '15 at 08:15
  • I agree, I'll recheck the previous question. – ErezO Apr 19 '15 at 08:18
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/75624/discussion-between-erezo-and-avinash-raj). – ErezO Apr 19 '15 at 08:18

1 Answer

Only an if/else check is needed for this case: split when the input contains a "::" substring, and otherwise return the input unsplit.

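>>> import regex  # third-party module (pip install regex); needed for (?R) recursion
>>> def csplit(s):
...     # No '::' anywhere: nothing to split, so return the input as a one-element list.
...     if '::' not in s:
...         return [s]
...     # Balanced <...> groups are captured whole; '::' outside them acts as the separator.
...     return [i for i in regex.split(r'(<(?:(?R)|[^<>])*>)|::', s) if i and i != ' ']
...
>>> csplit('a :: <<a :: b> c>::<a < a < b:: b> :: b> :: b> ::      a')
['a ', '<<a :: b> c>', '<a < a < b:: b> :: b> :: b>', '      a']
>>> csplit('a:<a b>')
['a:<a b>']
>>> csplit('a< b <c a>>')
['a< b <c a>>']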
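
For comparison, the same requirement can also be sketched without any regex by scanning the string once and tracking bracket depth. This is not the approach used above; the function name `split_outside_brackets` and its `sep` parameter are hypothetical, and the sketch assumes that an unmatched ">" at depth zero is treated as ordinary text. Whitespace around each piece is left untouched, as in the expected outputs in the question.

```python
def split_outside_brackets(s, sep="::"):
    """Split s on sep, ignoring any sep nested inside <...> (hypothetical helper)."""
    parts = []
    depth = 0   # current nesting level of '<' ... '>'
    start = 0   # index where the piece being collected begins
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == "<":
            depth += 1
        elif ch == ">" and depth > 0:
            depth -= 1
        elif depth == 0 and s.startswith(sep, i):
            # Separator found outside all brackets: close the current piece.
            parts.append(s[start:i])
            i += len(sep)
            start = i
            continue
        i += 1
    parts.append(s[start:])  # trailing piece (the whole string if no split happened)
    return parts


print(split_outside_brackets("a :: <<a :: b> c>::<a < a < b:: b> :: b> :: b> ::      a"))
print(split_outside_brackets("a< b <c a>>"))
print(split_outside_brackets("a:<a b>"))
```

Because the split is decided only by the depth counter, an input such as "a< b >" that contains no top-level "::" comes back as a single-element list, which is the case raised in the comments.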
Avinash Raj