
I have the following string:

'Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1'

and want to split it on the bracketed tokens [N+] and [O-] while keeping those tokens in the result. With re.split, the delimiters are dropped and I cannot recover them:

re.split(r'\[[^\]]*\]','Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1')

output:
['Cc1cc(', '(=O)', ')ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1']

and I am looking for something like this:

['Cc1cc(', '[N+]','(=O)','[O-]', ')ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1']

I am aware of related questions such as "Splitting on regex without removing delimiters" and "In Python, how do I split a string and keep the separators?"

Daniel

1 Answer


If you wrap the pattern in parentheses, making it a capturing group, re.split also returns the text matched by the group, which gives the desired output:

import re

s = 'Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1'

# Raw string avoids invalid-escape warnings; the outer (...) keeps the delimiters
re.split(r'(\[[^\]]*\])', s)

output:
['Cc1cc(', '[N+]', '(=O)', '[O-]', ')ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1']
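
One caveat: when the string starts or ends with a bracketed token, re.split with a capturing group returns empty strings at those positions. A minimal sketch that filters them out is below; the names BRACKET_ATOM and split_keep_brackets are made up for illustration and are not part of any library.

```python
import re

# Hypothetical helper names; the pattern is the same one used above.
BRACKET_ATOM = re.compile(r'(\[[^\]]*\])')

def split_keep_brackets(smiles):
    """Split on bracketed tokens, keep them, and drop empty pieces."""
    return [piece for piece in BRACKET_ATOM.split(smiles) if piece]

s = 'Cc1cc([N+](=O)[O-])ccc1OCC(C)(O)CN1CCN(Cc2ccccc2)CC1'
parts = split_keep_brackets(s)
print(parts)

# A string that begins with a bracketed token would otherwise yield a leading ''
print(split_keep_brackets('[O-]C'))
```

Joining the pieces back together with ''.join(parts) reproduces the original string, so nothing is lost by the split.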
Daniel