
I am trying to split a list into separate sublists wherever a specific item (a character or a group of characters) occurs.

For example:

Main_list = [ 'abcd 1233','cdgfh3738','hryg21','**L**','gdyrhr657','abc31637','**R**','7473hrtfgf'...]

I want to break this list and save each piece into a sublist whenever I encounter an 'L' or an 'R'.

Desired Result:

sublist_1 = ['abcd 1233','cdgfh3738','hryg21']
sublist_2 = ['gdyrhr657','abc31637']
sublist_3 = ['7473hrtfgf'...]

Is there a built-in function or a quick way to do this?

Edit: I do not want the delimiters to be included in the sublists.

Polyhedronic
  • 2
    @Kasramvd this question is not quite a duplicate of the post you linked to. This question pertains to splitting **at** positions that meet a condition. Your link pertains to splitting **before** positions that meet a condition. – pylang Jul 13 '18 at 23:54

3 Answers


Use a dictionary instead of a variable number of separately named variables.

In this case, you can use itertools.groupby to efficiently separate your lists:

L = ['abcd 1233','cdgfh3738','hryg21','**L**',
     'gdyrhr657','abc31637','**R**','7473hrtfgf']

from itertools import groupby

# define separator keys
def split_condition(x):
    return x in {'**L**', '**R**'}

# define groupby object
grouper = groupby(L, key=split_condition)

# convert to dictionary via enumerate
res = dict(enumerate((list(j) for i, j in grouper if not i), 1))

print(res)

{1: ['abcd 1233', 'cdgfh3738', 'hryg21'],
 2: ['gdyrhr657', 'abc31637'],
 3: ['7473hrtfgf']}
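
For readers who find the one-liner dense, here is a minimal sketch of the same groupby logic unrolled into an explicit loop. The names `is_separator`, `group` and `counter` are illustrative and not part of the original answer.

```python
from itertools import groupby

L = ['abcd 1233', 'cdgfh3738', 'hryg21', '**L**',
     'gdyrhr657', 'abc31637', '**R**', '7473hrtfgf']

def split_condition(x):
    # True for the separator items, False for everything else
    return x in {'**L**', '**R**'}

res = {}
counter = 1
# groupby yields one (key, group) pair per run of consecutive items
# that share the same key: True for separators, False for data
for is_separator, group in groupby(L, key=split_condition):
    if is_separator:
        continue  # skip the runs that consist of separators
    res[counter] = list(group)
    counter += 1

print(res)
```

This builds the same dictionary as the enumerate version: the keys 1, 2 and 3 map to the three sublists.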
jpp
  • Hey, thanks so much. This works. I am a beginner in Python. Could you please break this down for me, without using lambda functions and other one-liners? I would like to use this logic in other parts as well. – Polyhedronic Jul 13 '18 at 16:34
  • @Polyhedronic, Sure, I've updated without lambdas. Which line / function are you finding difficult to understand? – jpp Jul 13 '18 at 16:36
  • This is better. I have not used enumerate and lambda functions much. The reason I asked is that I want the groupby function to eventually use values from another list to automatically split the main list. In that case, I want to be able to feed this function some form of data structure. – Polyhedronic Jul 13 '18 at 16:53
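
Following up on the last comment, one possible way to feed the separators in from another list is to wrap the groupby logic in a function. This is only a sketch: the function name `split_by_separators` and the variable `separator_list` are hypothetical and do not come from the answer.

```python
from itertools import groupby

def split_by_separators(items, separators):
    """Return the sublists of items that lie between separator items."""
    separators = set(separators)  # a set gives fast membership tests

    def is_separator(x):
        return x in separators

    sublists = []
    for matched, group in groupby(items, key=is_separator):
        if not matched:  # keep only the runs of non-separator items
            sublists.append(list(group))
    return sublists

main_list = ['abcd 1233', 'cdgfh3738', 'hryg21', '**L**',
             'gdyrhr657', 'abc31637', '**R**', '7473hrtfgf']
separator_list = ['**L**', '**R**']  # could come from anywhere

print(split_by_separators(main_list, separator_list))
```

Because groupby merges consecutive separators into a single run, two delimiters in a row do not produce an empty sublist with this approach.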

Consider using one of the many helpful tools from a third-party library, e.g. more_itertools.split_at:

Given

import more_itertools as mit


lst = [
    "abcd 1233", "cdgfh3738", "hryg21", "**L**",
    "gdyrhr657", "abc31637", "**R**", 
    "7473hrtfgf"
]

Code

result = list(mit.split_at(lst, pred=lambda x: set(x) & {"L", "R"}))

Demo

sublist_1, sublist_2, sublist_3 = result

sublist_1
# ['abcd 1233', 'cdgfh3738', 'hryg21']
sublist_2
# ['gdyrhr657', 'abc31637']
sublist_3
# ['7473hrtfgf']

Details

The more_itertools.split_at function splits an iterable at positions that meet a special condition. The conditional function (predicate) happens to be a lambda function, which is equivalent to and substitutable with the following regular function:

def pred(x):
    a = set(x)
    b = {"L", "R"}
    return a.intersection(b)

Whenever the string x contains the character L or R, the predicate returns a non-empty set, which is truthy, and the split occurs at that position.
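
One consequence worth checking: because the predicate compares individual characters, any item that merely contains an uppercase L or R is treated as a delimiter too. The sketch below contrasts it with a stricter exact-match predicate. Both `pred_exact` and the sample string 'Report_7' are illustrative assumptions, not part of the original answer.

```python
def pred_chars(x):
    # the predicate from the answer: any shared character counts
    return set(x) & {"L", "R"}

def pred_exact(x):
    # hypothetical stricter variant: only the full delimiter strings match
    return x in {"**L**", "**R**"}

for item in ["**L**", "hryg21", "Report_7"]:
    # show how each predicate classifies the item
    print(item, bool(pred_chars(item)), pred_exact(item))
```

If the data can contain ordinary items with capital letters, the exact-match form is probably the safer choice; either function can be passed as pred.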

Install this package from the command line via > pip install more_itertools.

pylang
  • It's worth noting that the [source](https://more-itertools.readthedocs.io/en/stable/_modules/more_itertools/more.html#split_at) for this method is trivial and it's not *necessary* to install the library for this one function. – jpp Jul 13 '18 at 21:51
  • 1
    Easy there. For a beginner, generators may not be trivial. – pylang Jul 13 '18 at 21:54
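
To make the point in these comments concrete, here is a minimal generator sketch that splits at items matching a predicate without installing anything. It is not the library's actual source; the name `split_at_sketch` is made up for this illustration.

```python
def split_at_sketch(iterable, pred):
    """Yield lists of items, splitting at (and dropping) items where pred is truthy."""
    buf = []
    for item in iterable:
        if pred(item):
            yield buf  # hand back what was collected so far
            buf = []   # and start a fresh sublist
        else:
            buf.append(item)
    yield buf  # whatever follows the last delimiter

lst = ["abcd 1233", "cdgfh3738", "hryg21", "**L**",
       "gdyrhr657", "abc31637", "**R**", "7473hrtfgf"]

result = list(split_at_sketch(lst, lambda x: x in {"**L**", "**R**"}))
print(result)
```

Note that this sketch yields an empty list between two consecutive delimiters, and also when the input starts or ends with a delimiter.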

@Polyhedronic, you can also try this.

>>> import re
>>> Main_list = [ 'abcd 1233','cdgfh3738','hryg21','**L**','gdyrhr657','abc31637','**R**','7473hrtfgf']
>>>
>>> s = ','.join(Main_list)
>>> s
'abcd 1233,cdgfh3738,hryg21,**L**,gdyrhr657,abc31637,**R**,7473hrtfgf'
>>>
>>> items = re.split(r'\*\*R\*\*|\*\*L\*\*', s)
>>> items
['abcd 1233,cdgfh3738,hryg21,', ',gdyrhr657,abc31637,', ',7473hrtfgf']
>>>
>>> output = [[a for a in item.split(',') if a] for item in items]
>>> output
[['abcd 1233', 'cdgfh3738', 'hryg21'], ['gdyrhr657', 'abc31637'], ['7473hrtfgf']]
>>>
>>> sublist_1 = output[0]
>>> sublist_2 = output[1]
>>> sublist_3 = output[2]
>>>
>>> sublist_1
['abcd 1233', 'cdgfh3738', 'hryg21']
>>>
>>> sublist_2
['gdyrhr657', 'abc31637']
>>>
>>> sublist_3
['7473hrtfgf']
>>>
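
One caveat with joining on commas is that an item which itself contains a comma would be split apart. A possible variant, sketched here under the assumption that the delimiters are exactly '**L**' and '**R**', applies the regular expression to each item instead of to a joined string. The names `delimiter` and `output` are illustrative.

```python
import re

main_list = ['abcd 1233', 'cdgfh3738', 'hryg21', '**L**',
             'gdyrhr657', 'abc31637', '**R**', '7473hrtfgf']

# matches exactly '**L**' or '**R**' and nothing else
delimiter = re.compile(r'\*\*[LR]\*\*')

output = [[]]
for item in main_list:
    if delimiter.fullmatch(item):
        output.append([])        # start a new sublist at each delimiter
    else:
        output[-1].append(item)  # add to the sublist currently being built

print(output)
```

The items are never joined, so commas or other punctuation inside them are left untouched.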
hygull