
I'm trying to write a function that splits a string into items, then splits those items into subitems, and keeps going until it runs out of separator arguments.

For example, I want to split the following string first by commas, then by vertical bars, then by exclamation marks, and finally by the letter b. That is four splits.

s = 'abcde,abcde|abcde!abcde,abcde|abcde!abcde,abcde|abcde!abcde,abcde|abcde!abcde,abcde|abcde!abcde,abcde|abcde!abcde,abcde|abcde!abcde,abcde|abcde!'

Using the following I can get the desired result:

split1 = s.split(',')
split2 = map(lambda i:i.split("|"),split1)
split3 = map(lambda i: map(lambda subitem: subitem.split("!"),i),split2)
split4 = map(lambda i: map(lambda subitem: map(lambda subsubitem: subsubitem.split("b") ,subitem),i),split3)

Result:

[[[['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['']]]]
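
Note: the output above is what Python 2 shows, where `map` returns a list. In Python 3, `map` returns a lazy iterator, so the nested `map` calls would not print like this without extra `list()` calls. A minimal sketch of the same four manual steps using list comprehensions, on a shortened input (`short` is an illustrative name, not from the question):

```python
# Shortened input: two repetitions of the pattern from the question.
short = 'abcde,abcde|abcde!' * 2

split1 = short.split(',')
split2 = [item.split('|') for item in split1]
split3 = [[sub.split('!') for sub in item] for item in split2]
split4 = [[[subsub.split('b') for subsub in sub] for sub in item]
          for item in split3]

print(split4)
```

Each step adds one level of nesting, which is why a hand-written version needs one more nested comprehension per separator.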

However, I'd like to write a function that carries out this whole process and takes an arbitrary number of separators. In other words, the function could perform the four splits above, but it could equally split on only the exclamation mark and the vertical bar, or on any other number of separators.

How can I make a function that does the above process but looks like this?

func(s,*args)

so that the following call produces the same result as above:

func(s,",","|","!","b")
Chris
  • OK, so... what's your question? You might want to look at http://stackoverflow.com/q/36901/3001761. Also, do you actually want the ever-deeper nesting? – jonrsharpe Nov 29 '14 at 20:19

2 Answers

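from string import split  # Python 2 only; Python 3 has no string.split

def rec_split(s, *tokens):
    # Base case: no separators left, so return the piece unchanged.
    if not tokens:
        return s
    # Split on the first separator, then recurse on each piece with the rest.
    # (In Python 3, use s.split(tokens[0]) and wrap the map in list().)
    return map(lambda x: rec_split(x, *tokens[1:]), split(s, tokens[0]))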

For me this gives:

In [669]: s = (
    'abcde,abcde|abcde!abcde,abcde|abcde!'
    'abcde,abcde|abcde!abcde,abcde|abcde!'
    'abcde,abcde|abcde!abcde,abcde|abcde!'
    'abcde,abcde|abcde!abcde,abcde|abcde!'
)

In [670]: rec_split(s, ",", "|", "!", "b")
Out[670]: 
[[[['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['']]]]
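
The `string.split` function and the list-returning `map` used here are Python 2 behaviour. A possible Python 3 port, as a sketch that keeps the same `map` and `lambda` structure but assumes only the built-in `str.split` method and an explicit `list()`, might look like this:

```python
def rec_split(s, *tokens):
    # Base case: no separators left, so return the piece unchanged.
    if not tokens:
        return s
    # str.split replaces the Python 2 string.split function;
    # list() forces the lazy map so that nested lists are returned.
    return list(map(lambda x: rec_split(x, *tokens[1:]), s.split(tokens[0])))


# One repetition of the pattern from the question, to keep the output short.
print(rec_split('abcde,abcde|abcde!', ',', '|', '!', 'b'))
# [[[['a', 'cde']]], [[['a', 'cde']], [['a', 'cde'], ['']]]]
```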
ely

Essentially the same answer as @prpl.mnky.dshwshr, but simplified:

>>> s = ('abcde,abcde|abcde!abcde,abcde|abcde!abcde,abcde|'
...      'abcde!abcde,abcde|abcde!abcde,abcde|abcde!abcde,'
...      'abcde|abcde!abcde,abcde|abcde!abcde,abcde|abcde!')
>>> 
>>> def func(s, *args):
...     return [func(s, *args[1:]) for s in s.split(args[0])] if args else s
... 
>>> import pprint
>>> 
>>> pprint.pprint(func(s, ',', '|', '!', 'b'))
[[[['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['a', 'cde']]],
 [[['a', 'cde']], [['a', 'cde'], ['']]]]
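
The question also asks for calls with fewer separators. As a small sketch of how the same one-liner behaves there (the loop variable is renamed to `part` for readability, and `sample` is a made-up short input, not from the question):

```python
def func(s, *args):
    # Split on the first separator and recurse with the remaining ones;
    # with no separators left, return the string unchanged.
    return [func(part, *args[1:]) for part in s.split(args[0])] if args else s


sample = 'ab!cd|ef!gh'  # hypothetical short input

print(func(sample))            # 'ab!cd|ef!gh' (no separators: unchanged)
print(func(sample, '!', '|'))  # [['ab'], ['cd', 'ef'], ['gh']]
```

The nesting depth of the result always equals the number of separators passed, so a call with no separators returns a plain string rather than a list.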
ekhumoro
  • I wouldn't say that converting from `map` to a list comprehension makes it "simplified". More Pythonic, perhaps, but it's not really any simpler. Just different. – Mark Reed Nov 29 '14 at 21:53
  • Nonetheless, it's one line of code, which I prefer, and I find this one easier to read and understand. I've implemented it and it works! – Chris Nov 29 '14 at 22:43
  • @MarkReed. The only reason I bothered posting it was that it eliminates the `import` and `lambda`, which is all I meant by "simplified". The other differences are purely syntactic, as you say. – ekhumoro Nov 29 '14 at 23:01
  • I'm kind of a crank about this sort of thing. [Here's a rant I wrote about it on a different question.](http://stackoverflow.com/questions/27023999/pythonistic-way-to-intersect-and-add-elements-of-lists-at-the-same-time/27030754#27030754). In this case, I do see the benefits of the comprehension. Normally, though, the "better" way to remove the lambda in my answer is to just write it as a separate helper function outside the scope of the other function, and make it not rely on a closure to access my variable `s`. I don't like it when "Pythonic" is misappropriated to mean "brief at all costs." – ely Nov 30 '14 at 13:25
  • @prpl.mnky.dshwshr. I'm largely sympathetic to your views. My usual coding style is probably somewhere between what you advocate in your "rant" and what I've presented in my answer. I don't really know what the term "Pythonic" means - or, more to the point, what some people mean by it. When assessing the differences between one coding solution and another, I would hope that my conclusions are based on pragmatism, rather than on what's currently considered "idiomatic" by self-proclaimed "pythonistas". – ekhumoro Nov 30 '14 at 18:07