1

I have a list of lists, and I want to split each inner list into chunks of at most 30 items (usually two chunks). Any remainder that can't fill a chunk of 30 and isn't approximately close to 30 gets discarded.

For example: List 1 has a length of 64 -> split it into two lists of 30, 30, and discard the remaining 4.

or List 2 has length of 41, I generate a new list of 30 and discard the 11.

or List 3 has length of 58, I generate two lists of 30 and 28.

I'm using a list splitting function I found: https://stackoverflow.com/a/1751478/2027556

right now my code is something like:

new_list = []
for list_ in cluster:
    if len(list_) < 31 and len(list_) > 24:
        new_list.append(list_)
    elif len(list_) >= 31:
        chunks_list = chunks(list_, 30)
        for item in chunks_list:
            if len(item) > 25:
                new_list.append(item)

As you can see, right now I'm just building a new list while looping over the old one, but I think there's a more elegant, Pythonic solution, maybe using a list comprehension?

dl8

8 Answers

2

No need to be too clever about this, you can use the step argument to range():

cluster = list(range(100))
chunk_size = 30
result = [cluster[start:start+chunk_size] 
          for start in range(0, len(cluster), chunk_size)]
# discard last chunk if too small - adjust the test as needed
# (the `result and` guard avoids an IndexError when cluster is empty)
if result and len(result[-1]) < chunk_size:
    del result[-1]

The value of result will be a list of lists:

[ [0, 1, ..., 29],
  [30, 31, ..., 59],
  [60, 61, ..., 89] ]

(That said, you haven't described the input and output very clearly, i.e. you haven't given specific examples.)
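To apply the same range() stepping to every inner list and keep only chunks that are "close to 30", something like the following might work. This is a minimal sketch: the names `split_lists` and `min_size` are made up for illustration, and the cutoff of 26 is an assumption taken from the `len(item) > 25` test in the question.

```python
def split_lists(cluster, chunk_size=30, min_size=26):
    """Split each inner list into chunks of at most chunk_size items,
    keeping only chunks with at least min_size items (assumed cutoff)."""
    result = []
    for lst in cluster:
        # step through the inner list in strides of chunk_size
        for start in range(0, len(lst), chunk_size):
            chunk = lst[start:start + chunk_size]
            if len(chunk) >= min_size:
                result.append(chunk)
    return result


cluster = [list(range(64)), list(range(41)), list(range(58))]
print([len(chunk) for chunk in split_lists(cluster)])  # [30, 30, 30, 30, 28]
```

Note that this does not cap the number of chunks at two; a list of 100 items would yield three chunks.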

millimoose
0

Something like the following should work:

tmp = ((x[i:i+30] for i in range(0, len(x), 30)) for x in cluster)
new_list = [x for lst in tmp for x in lst if len(x) > 25]
Andrew Clark
0

# Python 3: range() replaces xrange(), and // keeps the chunk count an integer
new_list = [[lst[i*30:(i+1)*30] for i in range(len(lst) // 30)] for lst in cluster]
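The comprehension above keeps only full 30-item chunks and produces one nested list per input list. A small sketch of that behaviour, assuming Python 3 and the three example lengths from the question, with an optional flattening step via itertools.chain.from_iterable:

```python
from itertools import chain

cluster = [list(range(64)), list(range(41)), list(range(58))]

# One nested list of full-size chunks per input list.
nested = [[lst[i * 30:(i + 1) * 30] for i in range(len(lst) // 30)]
          for lst in cluster]
print([[len(chunk) for chunk in group] for group in nested])  # [[30, 30], [30], [30]]

# Flatten one level to get a single list of chunks.
flat = list(chain.from_iterable(nested))
print([len(chunk) for chunk in flat])  # [30, 30, 30, 30]
```

Because of the floor division, the 28-item remainder of the 58-item list is dropped here, which differs from the "approximately close to 30" rule in the question.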

user1149913
0

First, I'd use the grouper recipe from the itertools docs to get the groups:

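chunk_size = 30
new_list = list(grouper(chunk_size, cluster))

Then filter the last group to remove the fillvalue entries, and, if the result isn't "approximately close to 30", remove it.

# drop the None padding; filter(None, ...) would also drop falsy items such as 0
new_list[-1] = [element for element in new_list[-1] if element is not None]
if len(new_list[-1]) < chunk_size:
    del new_list[-1]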

If None is a valid element, use something else as a sentinel:

sentinel = object()
new_list = list(grouper(30, cluster, fillvalue=sentinel))
new_list[-1] = [element for element in new_list[-1] if element is not sentinel]
if len(new_list[-1]) < chunk_size:
    del new_list[-1]

Meanwhile, there's some talk about adding a zip_strict to itertools, which would allow a grouper recipe that returns a short final group, instead of padding it with fillvalue. If this happens in, say, 3.4, you could simplify this to something like:

new_list = list(grouper(30, cluster, strict=True))
if len(new_list[-1]) < chunk_size:
    del new_list[-1]

Or, of course, you could use one of the "strict grouper" implementations being bandied about on the python-ideas list, or just write your own that wraps up the grouper and filter calls above.
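As one way to wrap up the grouper and filter steps, here is a self-contained sketch. The `grouper` below follows the older itertools recipe with the (n, iterable) argument order used in this answer; `strict_groups`, `_SENTINEL`, and the `min_size=26` cutoff are hypothetical names and values chosen for illustration, not part of itertools.

```python
from itertools import zip_longest

_SENTINEL = object()  # hypothetical padding marker, never equal to real data


def grouper(n, iterable, fillvalue=None):
    """Collect data into fixed-length groups, padding the last one
    (argument order follows the call used in this answer)."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def strict_groups(iterable, chunk_size=30, min_size=26):
    """Yield lists of up to chunk_size items, skipping any group
    that has fewer than min_size real items after padding is removed."""
    for group in grouper(chunk_size, iterable, fillvalue=_SENTINEL):
        items = [x for x in group if x is not _SENTINEL]
        if len(items) >= min_size:
            yield items


print([len(g) for g in strict_groups(range(58))])  # [30, 28]
print([len(g) for g in strict_groups(range(64))])  # [30, 30]
```

Using a private sentinel object rather than None means that None, 0, and other falsy elements in the data survive the filtering.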

abarnert
0

If you really want a list comprehension..

new_list = [cluster[i:i+30] for i in range(0, len(cluster), 30) if len(cluster[i:i+30]) > 25]
GP89
0

You can use a generator function along with itertools.islice:

In [11]: from itertools import islice

In [12]: lis = [range(64), range(41), range(58)]

In [13]: def solve(lis):
    ...:     for x in lis:
    ...:         it = iter(x)
    ...:         q, r = divmod(len(x), 30)
    ...:         if r > 25:
    ...:             # remainder is close enough to 30: emit it as an extra chunk
    ...:             for _ in range(q + 1):
    ...:                 yield list(islice(it, 30))
    ...:         else:
    ...:             for _ in range(q):
    ...:                 yield list(islice(it, 30))
    ...:

In [14]: list(map(len, solve(lis)))  # use just `list(solve(lis))` to get the desired answer
Out[14]: [30, 30, 30, 30, 28]  # (30, 30) from 64, (30) from 41, and (30, 28) from 58
Ashwini Chaudhary
0

For exactly two lists, just write two slices:

new_list = [cluster[:30], cluster[30:60]]
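If `cluster` is a list of lists, as in the question, the two slices could be applied to each inner list and filtered by length. A minimal sketch, assuming the `> 25` cutoff from the question's code:

```python
cluster = [list(range(64)), list(range(41)), list(range(58))]

# Take at most two 30-item slices per inner list and keep the long-enough ones.
new_list = [part
            for lst in cluster
            for part in (lst[:30], lst[30:60])
            if len(part) > 25]
print([len(part) for part in new_list])  # [30, 30, 30, 30, 28]
```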
John La Rooy
0

Modifying the grouper from python itertools, you can do something like:

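from itertools import zip_longest

def grouper(n, iterable, max_chunks):
    # n references to the same iterator, so zip_longest pulls n items per group
    args = [iter(iterable)] * n
    chunks = []

    for zipped in zip_longest(*args, fillvalue=None):
        # strip the None padding from the final group (this also drops real None items)
        chunks.append([x for x in zipped if x is not None])
        if len(chunks) == max_chunks:
            break

    return chunks

new_lists = [grouper(10, li, 2) for li in list_list]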

For each input list, this returns a list of its chunks.
If you want the result a bit flatter, you can call it like this:

new_lists = []
for li in list_list:
    new_lists.extend(grouper(10,li,2))
Serdalis