I want to split each list inside a list of lists into sublists of given lengths. I have a courses array which looks like this:

[['CS105', 'ENG101', 'MATH101', 'GER', 'ENG102', 'CS230', 'MATH120', 'GER', 'CS205', 'FREE', 'GER', 'CS106', 'CS215', 'CS107', 'ENG204', 'GER', 'MATH220', 'CS300', 'CS206', 'CS306', 'GER', 'FREE', 'CS312', 'CS450', 'GER', 'CS321', 'FREE', 'CS325', 'GER', 'CS322', 'MAJOR', 'CS310', 'STAT205', '', 'CS443', 'CS412', 'CS421', 'GER', 'CS444', 'FREE', 'FREE','','',''], ['CS105', 'ENG101', 'MATH101', 'GER', 'ENG102', 'CS230', 'MATH120', 'GER', 'CS205', 'FREE', 'GER', 'CS106', 'CS215', 'CS107', 'ENG204', 'GER', 'MATH220', 'CS300', 'CS206', 'CS306', 'GER', 'FREE', 'CS312', 'CS450', 'GER', 'CS321', 'FREE', 'CS325', 'GER', 'CS322', 'MAJOR', 'CS310', 'STAT205', '', 'CS443', 'CS412', 'CS421', 'GER', 'CS444', 'FREE', 'FREE','','',''],...]

I want to split each list into sublists so that the result looks like this:

[[['CS105', 'ENG101', 'MATH101', 'GER'],['ENG102', 'CS230', 'MATH120', 'GER'], ['CS205', 'FREE'], ['GER'], ['CS106', 'CS215', 'CS107','ENG204', 'GER'], ['MATH220', 'CS300', 'CS206', 'CS306'], ['GER', 'FREE'], ['CS312'], ['CS450', 'GER', 'CS321', 'FREE', 'CS325'], ['GER', 'CS322', 'MAJOR', 'CS310'], ['STAT205',''], [''], ['CS443', 'CS412', 'CS421', 'GER',''], ['CS444', 'FREE', 'FREE',''],['','']]...]

What I have done so far is the following:

from itertools import accumulate

schedule = [4, 4, 2, 1, 5, 4, 2, 1, 5, 4, 2, 1, 5, 4, 2, 1]
for i in courses:
    Output = [courses[x - y: x] for x, y in zip(accumulate(schedule), schedule)]
print(Output[0])

However, what Output[0] prints is 4 whole lists in a row, so as far as I can tell it is taking groups of 4 from the outer list. schedule holds the lengths I want each list to be split into. I cannot work out how I need to loop in order to get the result I need.

piggy
  • The sublists of your source list are identical, is it intended or have you just pasted the same list twice? Oh, and the lists are only 44 items long, but `sum(schedule)` is 47 - are 3 items missing? – Błotosmętek May 06 '20 at 13:41
  • They are the courses each student has taken, so they contain the same items, just in a different order. If you look at the end of the first sublist there are, for example, 2 empty items, while the second one has 3. There are 1500 sublists in total; these 2 just happen to be very similar @Błotosmętek – piggy May 06 '20 at 13:46
  • Your desired output does not match the number of elements defined by `schedule` - for example, the first number `5` in `schedule` corresponds to the three element list `['CS106', 'CS215', 'CS107']` in your desired output. Is this a typo or do you have some additional logic to your program? – jfaccioni May 06 '20 at 13:59
  • Does this answer your question? [How do you split a list into evenly sized chunks?](https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks) – kojiro May 06 '20 at 14:02
  • It is a typo, thank you for noticing! I will change it now! @jfaccioni – piggy May 06 '20 at 14:03
  • @kojiro as I see from the link you sent me it is for length n, while I wanted it to be for different length each time. Maybe I did not understand the answers there but the answers provided in this post were exactly as I needed them. Thank you for your suggestion! – piggy May 06 '20 at 17:11

3 Answers

Got a working piece of code for you:

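new_list = []
for c in courses:
    _sum = 0  # running start index into the current course list
    _list = []
    for t in schedule:
        # take the next t items, then move the start index forward by t
        el = c[_sum:_sum + t]
        _sum += t
        _list.append(el)
    new_list.append(_list)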

Probably can be done quicker but this should do the job!
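
As a quick check of how this behaves, here is a sketch of the same loop wrapped in a helper and run on made-up data. The name split_by_lengths and the toy lists are only for illustration and are not from the question. Slicing past the end of a list simply returns a shorter (or empty) list, which matters here because the sample lists have 44 items while the schedule sums to 47.

```python
def split_by_lengths(row, lengths):
    """Cut `row` into consecutive chunks whose sizes follow `lengths`."""
    chunks = []
    start = 0  # index where the next chunk begins
    for size in lengths:
        chunks.append(row[start:start + size])
        start += size
    return chunks


# Hypothetical toy data: the second row is shorter than sum(toy_schedule).
toy_courses = [
    ['A', 'B', 'C', 'D', 'E', 'F'],
    ['A', 'B', 'C', 'D'],
]
toy_schedule = [2, 1, 3]

toy_result = [split_by_lengths(row, toy_schedule) for row in toy_courses]
print(toy_result)
```

The last chunk of the second row comes out with a single item instead of three, and no error is raised.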

Write me a message if you have any questions or it is not working properly! Hope I could help.

Noah
  • It works great as well! I cannot tell you whether this code is better than the answer provided by @jfaccioni, but they both work and give the result needed! – piggy May 06 '20 at 17:14

Here's a function that does what you want:

from itertools import accumulate

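def divide_into_schedule(input_list, schedule):
    # Prepend a 0 so the running totals start at index 0. Build a new list
    # instead of inserting into the caller's schedule, so it is not modified.
    if schedule[0] != 0:
        schedule = [0, *schedule]
    indices = list(accumulate(schedule))
    # One empty output sublist per input sublist.
    output_list = [[] for _ in input_list]
    # Consecutive pairs of running totals are the start/end of each chunk.
    for start_index, end_index in zip(indices, indices[1:]):
        for input_sublist, output_sublist in zip(input_list, output_list):
            output_sublist.append(input_sublist[start_index:end_index])
    return output_list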

The core idea here is:

  • create a list of indices to take from each sublist by accumulating the schedule list;
  • initialize an output list with the same structure as the input list;
  • zip over the list of indices and the same list of indices with an element shifted (indices[1:]), so that the values become the start and end index to select from each sublist;
  • zip over each sublist in input_list and output_list;
  • using the start/end indices, "copy-paste" the desired sub-sublist from input_list onto output_list.
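
To make the index bookkeeping in the steps above concrete, here is a small sketch on a shortened schedule. The names toy_schedule, bounds and row are made up for illustration; the leading 0 is prepended by hand here, which has the same effect as what the function does.

```python
from itertools import accumulate

# Hypothetical shortened schedule, just to keep the numbers readable.
toy_schedule = [4, 4, 2, 1]

# Running totals with a leading 0: each value is where a chunk starts or ends.
indices = [0] + list(accumulate(toy_schedule))
print(indices)  # [0, 4, 8, 10, 11]

# Pair every index with its successor to get (start, end) for each chunk.
bounds = list(zip(indices, indices[1:]))
print(bounds)  # [(0, 4), (4, 8), (8, 10), (10, 11)]

# Apply the bounds to one sublist of 11 made-up items.
row = list('ABCDEFGHIJK')
parts = [row[start:end] for start, end in bounds]
print(parts)
```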

Call it by passing the input list and schedule as such:

input_list = [['CS105', 'ENG101', 'MATH101', 'GER', 'ENG102', 'CS230', 'MATH120', 'GER', 'CS205', 'FREE', 'GER', 'CS106', 'CS215', 'CS107', 'ENG204', 'GER', 'MATH220', 'CS300', 'CS206', 'CS306', 'GER', 'FREE', 'CS312', 'CS450', 'GER', 'CS321', 'FREE', 'CS325', 'GER', 'CS322', 'MAJOR', 'CS310', 'STAT205', '', 'CS443', 'CS412', 'CS421', 'GER', 'CS444', 'FREE', 'FREE','','',''], ['CS105', 'ENG101', 'MATH101', 'GER', 'ENG102', 'CS230', 'MATH120', 'GER', 'CS205', 'FREE', 'GER', 'CS106', 'CS215', 'CS107', 'ENG204', 'GER', 'MATH220', 'CS300', 'CS206', 'CS306', 'GER', 'FREE', 'CS312', 'CS450', 'GER', 'CS321', 'FREE', 'CS325', 'GER', 'CS322', 'MAJOR', 'CS310', 'STAT205', '', 'CS443', 'CS412', 'CS421', 'GER', 'CS444', 'FREE', 'FREE','','',''],...]
schedule = [4, 4, 2, 1, 5, 4, 2, 1, 5, 4, 2, 1, 5, 4, 2, 1]
output_list = divide_into_schedule(input_list, schedule)

The result isn't exactly the same as your desired output due to the typos I mentioned in my comment, but I believe it does what you want.

jfaccioni
  • This worked really well and was fast compared to the other things I tried! Thank you so much for the clear explanation as well!! – piggy May 06 '20 at 17:08

Your original idea of using itertools.accumulate was actually quite good; you just made a couple of small mistakes:

from itertools import accumulate

output = [[sublist[x - y:x] for x, y in zip(accumulate(schedule), schedule)] for sublist in courses]
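
To spell out what differs from the code in the question, as far as I can tell from reading it: the original sliced courses itself (so each "chunk" was a run of whole lists) instead of the loop variable, and it reassigned Output on every pass instead of collecting one result per sublist. A minimal sketch on made-up data (toy_courses and toy_schedule are illustrative names, not from the question):

```python
from itertools import accumulate

# Hypothetical toy data; each row has exactly sum(toy_schedule) items.
toy_courses = [
    ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
    ['H', 'I', 'J', 'K', 'L', 'M', 'N'],
]
toy_schedule = [3, 1, 3]

# accumulate() yields the end index of each chunk; end - size is its start.
# The slice is taken from each sublist, and the outer comprehension keeps
# one list of chunks per sublist.
output = [
    [sublist[end - size:end] for end, size in zip(accumulate(toy_schedule), toy_schedule)]
    for sublist in toy_courses
]
print(output[0])
```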
Błotosmętek
  • I don't know if I am still doing something wrong in my code, but even when I replace that part of the code with this one, it takes too long to print the result, so I end up stopping it. Thank you for your answer! – piggy May 06 '20 at 17:16