Here's the data I have:

data = (i for i in list("abcdefghijklmnopqrstuvwxyzabcedefghijklmnopqrstuvwxyz"))

Here data is a generator. I want to iterate over it and build batches of 12 items each; if the last batch has fewer than 12 items, I need it too. The code below does not work:

subsets = []
subset = []
for en, i in enumerate(data):
    if en % 12 == 0 and en > 0:
        subsets.append(subset)
        subset = []
    else:
        subset.append(i)

print(subsets)

[['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l'],
 ['n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x'],
 ['z', 'a', 'b', 'c', 'e', 'd', 'e', 'f', 'g', 'h', 'i'],
 ['k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u']]

My code does not work properly: the first nested list has 12 values, but the rest have only 11, and the last few values (the final batch of fewer than 12) are missing entirely.

Expected Output:

[['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l'],
 ['m', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x'],
 ['y', 'z', 'a', 'b', 'c', 'e', 'd', 'e', 'f', 'g', 'h', 'i'],
 ['j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u'],
 ['v', 'w', 'x', 'y', 'z']]
user_12

3 Answers

Two changes: start enumerating from 1, and append the current element to the sublist before resetting it:

data = (i for i in list("abcdefghijklmnopqrstuvwxyzabcedefghijklmnopqrstuvwxyz"))
subsets = []
subset = []
# start counting from index '1'
for en, i in enumerate(data, 1):
    # append the current element before checking whether the batch is full
    subset.append(i)
    if en % 12 == 0:
        subsets.append(subset)
        subset = []
# append the left-over sublist/subset to your main list as well (if any)
if subset:
    subsets.append(subset)

for i in subsets:
    print(i)

gives

['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']
['m', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x']
['y', 'z', 'a', 'b', 'c', 'e', 'd', 'e', 'f', 'g', 'h', 'i']
['j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u']
['v', 'w', 'x', 'y', 'z']
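
To see why the two changes matter, here is a small side-by-side sketch on a shorter input. The function names batch_buggy and batch_fixed and the batch size of 3 are made up for illustration; the first function mirrors the loop from the question, the second applies the changes described above.

```python
def batch_buggy(items, size):
    """The question's loop: drops the element at every multiple of `size`."""
    subsets, subset = [], []
    for en, i in enumerate(items):
        if en % size == 0 and en > 0:
            subsets.append(subset)
            subset = []      # the current element `i` is never stored
        else:
            subset.append(i)
    return subsets           # the trailing partial batch is never added


def batch_fixed(items, size):
    """Count from 1, store the element first, then flush full batches."""
    subsets, subset = [], []
    for en, i in enumerate(items, 1):
        subset.append(i)
        if en % size == 0:
            subsets.append(subset)
            subset = []
    if subset:               # keep the left-over partial batch, if any
        subsets.append(subset)
    return subsets


print(batch_buggy(range(8), 3))  # [[0, 1, 2], [4, 5]]
print(batch_fixed(range(8), 3))  # [[0, 1, 2], [3, 4, 5], [6, 7]]
```

In the buggy version the elements 3 and 6 land on the branch that only flushes the batch, so they are lost, and the final [7] is never appended.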
Jarvis

An alternative is to use itertools.islice from the standard library. You can check which approach is faster or more convenient for you.

import itertools

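def gen_sublist(your_iter, size):
    # iter() makes this safe for lists and other re-iterable inputs too;
    # without it, islice would restart from the beginning on every pass
    your_iter = iter(your_iter)
    while True:
        part = tuple(itertools.islice(your_iter, size))
        if not part:
            break
        yield part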

data = (i for i in list("abcdefghijklmnopqrstuvwxyzabcedefghijklmnopqrstuvwxyz"))

for c in gen_sublist(data, size=12):
    print(c)

which returns:

('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l')
('m', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x')
('y', 'z', 'a', 'b', 'c', 'e', 'd', 'e', 'f', 'g', 'h', 'i')
('j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u')
('v', 'w', 'x', 'y', 'z')
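
On Python 3.12 and later, the standard library ships itertools.batched, which does the same job and also yields tuples. A minimal sketch, assuming you may still need to run on older versions (the fallback function is only a stand-in built on the same islice idea, not the library's own implementation):

```python
import itertools

try:
    from itertools import batched  # available in Python 3.12+
except ImportError:
    def batched(iterable, n):
        """Fallback for older versions, using the islice loop."""
        it = iter(iterable)
        while True:
            chunk = tuple(itertools.islice(it, n))
            if not chunk:
                return
            yield chunk

data = (i for i in "abcdefghijklmnopqrstuvwxyz")
for chunk in batched(data, 12):
    print(chunk)
```

The last tuple is simply shorter when the input does not divide evenly, which is the behaviour asked for in the question.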
antoine

A different approach which does not use modulo or enumeration (just another option since other answers already correct your approach):

In [1]: subsets = []

In [2]: data = (i for i in list("abcdefghijklmnopqrstuvwxyzabcedefghijklmnopqrstuvwxyz"))

In [3]: while True:
   ...:     x = []
   ...:     try:
   ...:         for _ in range(12):
   ...:             x.append(next(data))
   ...:         subsets.append(x)
   ...:     except StopIteration:  # generator ran out of values
   ...:         if x:  # keep the partial last batch, skip it if empty
   ...:             subsets.append(x)
   ...:         break
   ...:

Outputs:

In [4]: subsets
Out[4]:
[['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l'],
 ['m', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x'],
 ['y', 'z', 'a', 'b', 'c', 'e', 'd', 'e', 'f', 'g', 'h', 'i'],
 ['j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u'],
 ['v', 'w', 'x', 'y', 'z']]
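
If you would rather avoid the try/except, next() accepts a default value that is returned when the iterator is exhausted. A rough sketch of the same idea as a function; the names take_batches and _DONE are made up for this example:

```python
_DONE = object()  # unique marker that cannot collide with real data


def take_batches(iterable, size):
    """Collect batches of `size` items; the last one may be shorter."""
    iterator = iter(iterable)
    batches = []
    while True:
        batch = []
        for _ in range(size):
            item = next(iterator, _DONE)  # no StopIteration raised
            if item is _DONE:
                break
            batch.append(item)
        if batch:
            batches.append(batch)
        if len(batch) < size:  # a short or empty batch means we are done
            return batches


print(take_batches(range(7), 3))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Unlike a bare except, this cannot accidentally swallow unrelated errors raised while producing the items.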

SajanGohil