3

I'm trying to find a simple way to convert a string like this:

a = '[[a b] [c d]]'

into the corresponding nested list structure, where the letters are turned into strings:

a = [['a', 'b'], ['c', 'd']]

I tried to use

import ast
l = ast.literal_eval('[[a b] [c d]]')
l = [i.strip() for i in l]

as found here

but it doesn't work because the characters a,b,c,d are not within quotes.

in particular I'm looking for something that turns:

'[[X v] -s]'

into:

[['X', 'v'], '-s']
Community
  • 1
  • 1
DaniPaniz
  • 1,058
  • 2
  • 13
  • 24

3 Answers3

5

You can use regex to find all items between brackets then split the result :

>>> [i.split() for i in re.findall(r'\[([^\[\]]+)\]',a)]
[['a', 'b'], ['c', 'd']]

The regex r'\[([^\[\]]+)\]' will match anything between square brackets except square brackets,which in this case would be 'a b' and 'c d' then you can simply use a list comprehension to split the character.

Note that this regex just works for the cases like this, which all the characters are between brackets,and for another cases you can write the corresponding regex, also not that the regex tick won't works in all cases .

>>> a = '[[a b] [c d] [e g]]'
>>> [i.split() for i in re.findall(r'\[([^\[\]]+)\]',a)]
[['a', 'b'], ['c', 'd'], ['e', 'g']]
Mazdak
  • 105,000
  • 18
  • 159
  • 188
2

Use isalpha method of string to wrap all characters into brackets:

a = '[[a b] [c d]]'

a = ''.join(map(lambda x: '"{}"'.format(x) if x.isalpha() else x, a))

Now a is:

'[["a" "b"] ["c" "d"]]'

And you can use json.loads (as @a_guest offered):

json.loads(a.replace(' ', ','))
Eugene Soldatov
  • 9,755
  • 2
  • 35
  • 43
  • Consider using `json.loads(a.replace(' ', ','))` instead of `ast.literal_eval` cause this way it will work with triply (or higher) nested lists. – a_guest Oct 15 '15 at 12:13
  • I'm trying to work this out, e.g.: a = '[[[a b] c] [a [b c]] d]' but it gives me a error – DaniPaniz Oct 15 '15 at 12:20
  • That's why you should use `json.loads` as mentioned in my previous comment. – a_guest Oct 15 '15 at 12:34
1
>>> import json
>>> a = '[[a b] [c d]]'
>>> a = ''.join(map(lambda x: '"{}"'.format(x) if x.isalpha() else x, a))
>>> a
'[["a" "b"] ["c" "d"]]'
>>> json.loads(a.replace(' ', ','))
[[u'a', u'b'], [u'c', u'd']]

This will work with any degree of nested lists following the above pattern, e.g.

>>> a = '[[[a b] [c d]] [[e f] [g h]]]'
>>> ...
>>> json.loads(a.replace(' ', ','))
[[[u'a', u'b'], [u'c', u'd']], [[u'e', u'f'], [u'g', u'h']]]

For the specific example of '[[X v] -s]':

>>> import json
>>> a = '[[X v] -s]'
>>> a = ''.join(map(lambda x: '"{}"'.format(x) if x.isalpha() or x=='-' else x, a))
>>> json.loads(a.replace('[ [', '[[').replace('] ]', ']]').replace(' ', ',').replace('][', '],[').replace('""',''))
[[u'X', u'v'], u'-s']
a_guest
  • 34,165
  • 12
  • 64
  • 118
  • it works great but.. can we fix it so that it's less picky about empty spaces? i.e. this works ---> a = '[[[a b] [c d]] f]' but this would make it crash --> '[[[a b][c d]] f]' – DaniPaniz Oct 15 '15 at 12:58
  • Yes, just substitute `a.replace(' ', ',')` with `a.replace(' ', ',').replace('][', '],[')`. – a_guest Oct 15 '15 at 13:11
  • awesome, and I know I'm a pain but... why does this make it crash? ---> '[[[a b] [c d] ] f]' anyway to fix it so that any empty spaces in the formula wouldn't bug? – DaniPaniz Oct 15 '15 at 14:09
  • It will insert a `,` between the two brackets of `] ]`. If you have patterns like that just do `a.replace('[ [', '[[').replace('] ]', ']]').replace(' ', ',').replace('][', '],[')`. Note that this will only work with a single whitespace between two brackets, if you have more you probably need to use regular expressions. – a_guest Oct 16 '15 at 14:38
  • sorry to bother you again... is there any way to hack this so that it works with this example: '[[X v] -s]' i.e. I get as output: [['X', 'v'], '-s'] (see the last edit of the question) – DaniPaniz Oct 19 '15 at 16:32