Python split list by multiple elements and conditions

Question

I am very new to Python and keep getting stuck on list splitting, even after reading many examples on Stack Overflow. How can I split a list under the following conditions?

Task 1. Split "datalist" into a new sublist each time an item from "wordlist" is found in it, like this:

Wordlist = ["Time", "date", "place",....]

output = ["A","B"]["Time","C","D","E"]["Date",.....]

Task 2. Once a specific item is found, split off a sublist containing that word and the n items that follow it, then continue looping over datalist. For example:

word, n = number of items that follow

Time, 1

Date, 2

Place, 1

....

input:

datalist = ["A","B", "N", "K" , "R", "Time", "2230" , "C" , "Date" , '12/05', "E" , "F", "R", "F", "K" ,"Place", "XXXXXX", "H", "I" , "J" ]

wordlist = ["Time", "Date", "Place"]

n = [1,2,1]

output:

newlist = [["A", "B", "N", "K", "R"], ["Time", "2230"], ["C"], ["Date", '12/05', "E"], ["F", "R", "F", "K"], ["Place", "XXXXXX"], ["H", "I", "J"]]

This is the example I referred to; it partially solves task 1 but not task 2: Python spliting a list based on a delimiter word

what have you tried that isn't working? also your expected inputs and outputs aren't 100% consistent with the wording of what you ask. — Aaron, Nov 08 '17 at 19:37
Sorry, the inputs and outputs are fixed just now, sorry for my typo. @aaron — Sth_paul, Nov 08 '17 at 19:39

Joe Iddon · Answer 1 · 2017-11-08T20:22:16.280

A one-line solution for task 1:

[datalist[:datalist.index(wordlist[0])]] + [datalist[datalist.index(wordlist[i]):datalist.index(wordlist[i+1])] for i in range(len(wordlist)-1)] + [datalist[datalist.index(wordlist[-1]):]]

which outputs:

[['A', 'B', 'N', 'K', 'R'], ['Time', '2230', 'C'], ['Date', '12/05', 'E', 'F', 'R', 'F', 'K'], ['Place', 'XXXXXX', 'H', 'I', 'J']]
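
The one-liner relies on `list.index`, which only finds the first occurrence of each keyword and raises `ValueError` if a keyword is missing; it also assumes the keywords appear in the same order as in `wordlist`. A minimal loop-based sketch of task 1 that avoids those assumptions is shown here. The function name `split_before_keywords` is hypothetical and not part of the original answer.

```python
def split_before_keywords(items, keywords):
    """Start a new chunk every time an item from `keywords` is seen."""
    keywords = set(keywords)          # set membership tests are O(1)
    chunks = [[]]                     # begin with one open chunk
    for item in items:
        if item in keywords and chunks[-1]:
            chunks.append([])         # close the current chunk, open a new one
        chunks[-1].append(item)
    return [c for c in chunks if c]   # drops the empty chunk left by empty input


datalist = ["A", "B", "N", "K", "R", "Time", "2230", "C", "Date", "12/05",
            "E", "F", "R", "F", "K", "Place", "XXXXXX", "H", "I", "J"]
wordlist = ["Time", "Date", "Place"]

print(split_before_keywords(datalist, wordlist))
# [['A', 'B', 'N', 'K', 'R'], ['Time', '2230', 'C'],
#  ['Date', '12/05', 'E', 'F', 'R', 'F', 'K'], ['Place', 'XXXXXX', 'H', 'I', 'J']]
```

Because it walks the list once, a keyword that appears several times starts a new chunk each time.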

Task 2:

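sol = []
i = 0  # index of the item being examined
s = 0  # start index of the current run of non-keyword items
while i < len(datalist):
    if datalist[i] in wordlist:
        # number of items that follow this keyword
        cs = n[wordlist.index(datalist[i])]
        # emit the run before the keyword, then the keyword with its cs followers
        sol += [datalist[s:i], datalist[i:i+cs+1]]
        i += cs    # skip over the followers
        s = i + 1  # the next run starts right after them
    i += 1

# whatever is left after the last keyword group
sol.append(datalist[s:])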

which outputs:

[['A', 'B', 'N', 'K', 'R'], ['Time', '2230'], ['C'], ['Date', '12/05', 'E'], ['F', 'R', 'F', 'K'], ['Place', 'XXXXXX'], ['H', 'I', 'J']]
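
The loop above appends `datalist[s:i]` unconditionally, so it emits an empty list whenever a keyword is the first item or directly follows another keyword's group, and the final `append` does the same when the data ends with a keyword group. The sketch below wraps the same index arithmetic in a function and skips empty slices. The name `split_with_counts` is hypothetical, and its `counts` parameter plays the role of `n`.

```python
def split_with_counts(datalist, wordlist, counts):
    """Task 2 with the while-loop approach, omitting empty sublists."""
    sol = []
    i = 0  # index of the item being examined
    s = 0  # start of the pending run of non-keyword items
    while i < len(datalist):
        if datalist[i] in wordlist:
            cs = counts[wordlist.index(datalist[i])]
            if s < i:                       # only emit a non-empty run
                sol.append(datalist[s:i])
            sol.append(datalist[i:i + cs + 1])
            i += cs                         # skip the followers
            s = i + 1
        i += 1
    if s < len(datalist):                   # only emit a non-empty tail
        sol.append(datalist[s:])
    return sol


# Keyword first, groups back to back, and data ending with a group:
print(split_with_counts(["Time", "2230", "Date", "12/05", "E"],
                        ["Time", "Date"], [1, 2]))
# [['Time', '2230'], ['Date', '12/05', 'E']]
```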

score 0 · Answer 2 · answered Nov 08 '17 at 19:58

datalist = ['A', 'B', 'N', 'K', 'R', 'Time', '2230', 'C', 'Date', '12/05',
            'E', 'F', 'R', 'F', 'K', 'Place', 'XXXXXX', 'H', 'I', 'J']

Getting the words :
>>> [dl for dl in datalist if dl.isalpha() and len(dl) > 1]
['Time', 'Date', 'Place', 'XXXXXX']

Getting the single characters works the same way as getting the words; the only difference is that the length must equal 1.

Getting the numbers :
>>> [dl for dl in datalist if dl.isnumeric()]
['2230']

Getting the dates:
>>> [dl for dl in datalist if '/' in dl]
['12/05']

That solution is a little crude. For a more refined one, I suggest using the re module.

You can then pack the results in a list to get the result you desire.
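
One way the `re` suggestion could look is to classify each item with `re.fullmatch`. The patterns, the labels, and the `classify` helper below are assumptions made for illustration. In particular, a date is assumed to be two digits, a slash, and two digits, as in '12/05'.

```python
import re

# Illustrative patterns, tried in order; the first full match wins.
PATTERNS = [
    ("date", re.compile(r"\d{2}/\d{2}")),
    ("number", re.compile(r"\d+")),
    ("word", re.compile(r"[A-Za-z]{2,}")),
    ("char", re.compile(r"[A-Za-z]")),
]


def classify(item):
    """Return the label of the first pattern that matches the whole item."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(item):
            return label
    return "other"


datalist = ['A', 'B', 'N', 'K', 'R', 'Time', '2230', 'C', 'Date', '12/05',
            'E', 'F', 'R', 'F', 'K', 'Place', 'XXXXXX', 'H', 'I', 'J']

# Group the items by label, keeping their original order within each group.
groups = {}
for item in datalist:
    groups.setdefault(classify(item), []).append(item)

print(groups["word"])    # ['Time', 'Date', 'Place', 'XXXXXX']
print(groups["number"])  # ['2230']
print(groups["date"])    # ['12/05']
```

Note that this groups items by kind; it does not by itself produce the ordered sublists the question asks for.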

Aaron · Accepted Answer · 2017-11-08T20:10:44.807


Similar to the approach in the other answer you linked, I'd point you toward a generator for a more general-purpose solution.

def split_list(items, splitwords=None):
    """Yield sublists of items, splitting at each keyword in splitwords.

    splitwords maps each keyword to the number of items that follow it.
    """
    splitwords = splitwords or {}  # avoid a mutable default argument
    out = []
    itemiter = iter(items)
    for word in itemiter:
        if word in splitwords:  # flush any pending non-keyword list, then build the keyword list
            if out:  # yield non-keyword list
                yield out
            out = [word]  # start new list with keyword
            try:
                for _ in range(splitwords[word]):  # add n more items after keyword
                    out.append(next(itemiter))
            except StopIteration:  # not enough items after keyword
                pass
            yield out  # yield keyword list
            out = []  # reset accumulator
        else:
            out.append(word)  # grow non-keyword list
    if out:  # yield trailing non-keyword list
        yield out

datalist = ["A","B", "N", "K" , "R", "Time", "2230" , "C" , "Date" , '12/05', "E" , "F", "R", "F", "K" ,"Place", "XXXXXX", "H", "I" , "J" ]
splitwords = {"Time": 1, "Date": 2, "Place": 1}

newlist = list(split_list(datalist, splitwords))
print(newlist)
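
The question supplies the keywords and counts as two parallel lists (`wordlist` and `n`), whereas the generator expects a dict. Assuming the two lists line up by position, `dict(zip(wordlist, n))` builds that mapping. To stay self-contained, the sketch below does not call the generator above; it uses a compact stand-in built on `itertools.islice`. The name `split_by_counts` is hypothetical.

```python
from itertools import islice


def split_by_counts(items, counts):
    """Yield runs of plain items and [keyword, followers...] groups in order."""
    it = iter(items)
    run = []
    for item in it:
        if item in counts:
            if run:                 # flush the pending non-keyword run
                yield run
                run = []
            # islice takes up to counts[item] followers and stops early
            # without error if the data runs out
            yield [item, *islice(it, counts[item])]
        else:
            run.append(item)
    if run:                         # trailing non-keyword run
        yield run


datalist = ["A", "B", "N", "K", "R", "Time", "2230", "C", "Date", "12/05",
            "E", "F", "R", "F", "K", "Place", "XXXXXX", "H", "I", "J"]
wordlist = ["Time", "Date", "Place"]
n = [1, 2, 1]

splitwords = dict(zip(wordlist, n))   # {'Time': 1, 'Date': 2, 'Place': 1}
newlist = list(split_by_counts(datalist, splitwords))
print(newlist)
# [['A', 'B', 'N', 'K', 'R'], ['Time', '2230'], ['C'], ['Date', '12/05', 'E'],
#  ['F', 'R', 'F', 'K'], ['Place', 'XXXXXX'], ['H', 'I', 'J']]
```

Sharing one iterator between the `for` loop and `islice` is what lets the followers be consumed without being revisited by the loop.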

edited Nov 08 '17 at 20:10

answered Nov 08 '17 at 20:01


this actually has some edge case fails... imma edit real quick – Aaron Nov 08 '17 at 20:06
see edit for edge case of starting with a keyword or keyword lists backing up to one another. – Aaron Nov 08 '17 at 20:11
Thank you so much! it's helpful to learn from your detailed markdown! @Aaron – Sth_paul Nov 08 '17 at 20:14
@Sth_paul really good code is usually at least 50% comments. If you come back to this in a year, you'll only understand what you did if you comment your code. If there are any things I did you don't understand please ask. – Aaron Nov 08 '17 at 20:15
will keep commenting as much as I can as a beginner. I guess you just give a good tip for me again, thank you. – Sth_paul Nov 08 '17 at 20:25
