3

I am trying to do the following..

I have a list of n elements. I want to split this list into 32 separate lists which contain more and more elements as we go towards the end of the original list. For example from:

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

I want to get something like this:

b = [[1],[2,3],[4,5,6,7],[8,9,10,11,12]]

I've done the following for a list containing 1024 elements:

for i in range (0, 32):
    c = a[i**2:(i+1)**2]
    b.append(c)

But I am stupidly struggling to find a reliable way to do it for other numbers like 256, 512, 2048 or for another number of lists instead of 32.

vshotarov
  • 145
  • 9
  • Why exactly `3` is doubled in your sample output? When algorithm should double items from original list and why it should not? – Łukasz Rogalski Mar 02 '16 at 18:48
  • Because Im a dummy and I messed up. Sorry.. fixing it now. It should never double. – vshotarov Mar 02 '16 at 18:49
  • This would be easy enough to accomplish if you just wanted a list of integers, but I take it that's just an example and you want the indices so this can work on arbitrary lists, right? – Two-Bit Alchemist Mar 02 '16 at 18:55
  • Yes, the lists I need to work on are floats. How would you do it for ints, though? I don't understand why that would be easier. – vshotarov Mar 02 '16 at 18:56
  • Because it obviates the need to pregenerate the list, which in turn obviates the need to generate the entire list, instead allowing us to generate only the portion we need. I wonder how far we can go with that. What is an example of your input and expected output? For example, in your sample, would the user input 32? or 1024? – Two-Bit Alchemist Mar 02 '16 at 18:59
  • I am doing spectral analysis frame by frame on a .wav file. Each frame I get 1764 samples (44100/25.0, where 25 is my FPS), I get their FFT in a 1024 list, then divide it by 2 to get the half, because it is symmetrical. Then, I want to split it into 2,4,8,16 or 32 bands for the different frequencies, because I am more interested in the lower ones and the higher ones can be averaged over longer periods. So, I will always get a 512 elements long list and I will need to split it to 2,4,8,16 or 32 bands logarithmically. – vshotarov Mar 02 '16 at 19:09
  • 1
    Neither of your examples are split logarithmically, both are arithmetic sequences. – Jared Goguen Mar 02 '16 at 19:14
  • 1
    So is your intention to get more or less "even" splits as you adjust the list length (in your case 1024) up or down? Not even in the sense that they're all the same size, but the size increases at roughly the same rate as the length of the sublists grow, for however many divisions? – Two-Bit Alchemist Mar 02 '16 at 19:17
  • Insufficient data. @vshotarov You need to clarify just what you want to end up with. As Jared said, your example is not logarithmic. Do you want approximately uniform increases in list length? Are you sure your initial list will always split evenly into your desired result? – Mike Mar 02 '16 at 19:19
  • Right, sorry if it's unclear. I get that the example is a bit misleading. Basically, I have a list with 1024 elements. I want to split it into a list containing 32 lists, where each of them contains twice more elements than the previous one. – vshotarov Mar 02 '16 at 19:22
  • Since you want sublists of sizes 1, 2, 3, ..., 32 elements, What happens if if you do not have enough elements? What happens if you have more? How do you adjust the size of the sublist in those cases? – innoSPG Mar 02 '16 at 19:23
  • I am fine with some of them containing the same amount of elements, but not less than the previous and ideally more. EDIT: What I want is something like this if that makes sense `[.][..][....][......][..........][...............]`. – vshotarov Mar 02 '16 at 19:24
  • Still something to clarify, you just stated in a comment that each sublist contains twice more elements than the previous one. Do you mean what you said? Because from the example, each contains only 1 element more than the previous in the ideal case. – innoSPG Mar 02 '16 at 19:27
  • I edited the post, so take a look at the new example. Sorry for the confusion, this is the perfect example, because, as i said, ideally it grows by multiplying by 2, but in the end we have to include the last element as well, because otherwise it's left out. – vshotarov Mar 02 '16 at 19:30

4 Answers4

2

Use an iterator, a for loop with enumerate and itertools.islice:

import itertools
def logsplit(lst):
    iterator = iter(lst)
    for n, e in enumerate(iterator):
        yield itertools.chain([e], itertools.islice(iterator, n))

Works with any number of elements. Example:

for r in logsplit(range(50)):
    print(list(r))

Output:

[0]
[1, 2]
[3, 4, 5]
[6, 7, 8, 9]
... some more ...
[36, 37, 38, 39, 40, 41, 42, 43, 44]
[45, 46, 47, 48, 49]

In fact, this is very similar to this problem, except it's using enumerate to get variable chunk sizes.

Community
  • 1
  • 1
tobias_k
  • 81,265
  • 12
  • 120
  • 179
  • In case the downvote was for the typo in the code: Fair enough, I just corrected it. Otherwise: Why the downvote? – tobias_k Mar 02 '16 at 19:33
  • Thats a cool way to do it, but again, as in the previous answer the last list contains less elements than the one before, which is not ideal. I guess I could run it on a range `n+1` and then append the last one to the one before, so I end up with `n` lists, where the last one is significantly larger than the previous, which is fine. – vshotarov Mar 02 '16 at 19:33
  • @vshotarov The last list contains fewer elements because there are fewer elements left for the last list. What would you like instead? The last few elements that are not enough to fill the next sized list just to be dropped? – tobias_k Mar 02 '16 at 19:35
  • No no, I totally understand why it happens, and that's exactly my problem. As I said in the previous comment, I can easily do this to split into 33 lists and then append the elements from 33 to 32, so I end up with 32, so it could definitely do the job and if nothing better comes up I will use this. – vshotarov Mar 02 '16 at 19:38
  • @vshotarov Okay, so just to get this straight: The only requirement is that each list has more elements than the list before that? And probably it should also be as many lists as possible, I guess? – tobias_k Mar 02 '16 at 19:40
  • It's always going to be split into 4,8,16 or 32 lists. The requirment is that each list has more or equal to the previous one, but ideally growing with a factor close to 2. So obviously not something like this `[.][..][....][.............................................................]`. – vshotarov Mar 02 '16 at 19:46
  • I tested your code with 33 elements and doing what I explained above, which gave me a very well working solution, so I am accepting your answer. Thanks! – vshotarov Mar 02 '16 at 19:50
1

Something like this should solve the problem.

for i in range (0, int(np.sqrt(2*len(a)))):
    c = a[i**2:min( (i+1)**2, len(a) )]
    b.append(c)

Not very pythonic but does what you want.

def splitList(a, n, inc):
    """
    a list to split
    n number of sublist
    inc ideal difference between the number of elements in two successive sublists
    """
    zr = len(a) # remaining number of elements to split into sublists
    st = 0 # starting index in the full list of the next sublist
    nr = n # remaining number of sublist to construct
    nc = 1 # number of elements in the next sublist
    #
    b=[]
    while (zr/nr >= nc and nr>1):
        b.append( a[st:st+nc] )
        st, zr, nr, nc = st+nc, zr-nc, nr-1, nc+inc
    #
    nc = int(zr/nr)
    for i in range(nr-1):
        b.append( a[st:st+nc] )
        st = st+nc
    #
    b.append( a[st:max(st+nc,len(a))] )
    return b

# Example of call
# b = splitList(a, 32, 2)
# to split a into 32 sublist, where each list ideally has 2 more element
# than the previous
innoSPG
  • 4,588
  • 1
  • 29
  • 42
  • I tested this, but it gives me empty lists towards the end. I ran it on a list containing 512 elements and in the output list, elements from 23 to 32 are empty arrays. – vshotarov Mar 02 '16 at 19:05
  • @vshotarov, you are right, I did not test it. There is a need to adjust the range. I will look at that few minutes later. Meanwile, you can just remove the empty elements. – innoSPG Mar 02 '16 at 19:07
  • Yes, I could, but the point is to split the list into 32 lists. I will be fine with some of them containing the same number of elements, but I can't have empty ones. – vshotarov Mar 02 '16 at 19:11
  • @vshotarov I missed the point with 32. I considered only the logarithmic adjective. – innoSPG Mar 02 '16 at 19:17
  • I understand now that the logarithimc adjective is quite misleading. – vshotarov Mar 02 '16 at 19:27
  • @vshotarov, I edited the post again to suggest a function. You might want to have a look even though you accepted an answer. From what you explain, a sublist should not have less element than the previous. The accepted answers is failling that criteria, that is why I suggested this one. – innoSPG Mar 02 '16 at 20:00
1

There's always this.

>>> def log_list(l):
    if len(l) == 0:
        return [] #If the list is empty, return an empty list

    new_l = [] #Initialise new list
    new_l.append([l[0]]) #Add first iteration to new list inside of an array

    for i in l[1:]: #For each other iteration,
        if len(new_l) == len(new_l[-1]):
            new_l.append([i]) #Create new array if previous is full
        else:
            new_l[-1].append(i) #If previous not full, add to it

    return new_l

>>> log_list([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
[[1], [2, 3], [4, 5, 6], [7, 8, 9, 10]]
GarethPW
  • 399
  • 1
  • 3
  • 11
1

This is incredibly messy, but gets the job done. Note that you're going to get some empty bins at the beginning if you're logarithmically slicing the list. Your examples give arithmetic index sequences.

from math import log, exp

def split_list(_list, divs):
    n = float(len(_list))
    log_n = log(n)
    indices = [0] + [int(exp(log_n*i/divs)) for i in range(divs)]
    unfiltered = [_list[indices[i]:indices[i+1]] for i in range(divs)] + [_list[indices[i+1]:]]
    filtered = [sublist for sublist in unfiltered if sublist]
    return [[] for _ in range(divs- len(filtered))] + filtered


print split_list(range(1024), 32)

Edit: After looking at the comments, here's an example that may fit what you want:

def split_list(_list):
    copy, output = _list[:], []
    length = 1
    while copy:
        output.append([])
        for _ in range(length):
            if len(copy) > 0:
                output[-1].append(copy.pop(0))
        length *= 2
    return output


print split_list(range(15))
# [[0], [1, 2], [3, 4, 5, 6], [7, 8, 9, 10, 11, 12, 13, 14]]

Note that this code is not efficient, but it can be used as a template for writing a better algorithm.

Jared Goguen
  • 8,772
  • 2
  • 18
  • 36
  • That's an interesting one, but from your example the first few lists are actually empty? And then on one occasion the element has less elements than the one before, which is a problem. – vshotarov Mar 02 '16 at 19:36
  • @vshotarov Check out the second example, it may be closer to what you want. – Jared Goguen Mar 02 '16 at 19:37
  • The problem with this is that it doesn't give me control over the final number of lists. Otherwise it works great! – vshotarov Mar 02 '16 at 19:41