0

I have a huge list of about 12,000 elements, with a closing parenthesis `)` after every 20 elements.

An example of the first three:

(((76, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248, 251, 261, 315, 329), 76, 151, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248, 251, 261), 76, 151, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 221, 232, 233, 242, 244, 248, 251), 76, 151, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 221, 229, 232, 233, 242, 244, 248),

How do I get another list with 600 sublists, one for each of the groups of 20 elements?

roybatty
  • 2
    You might be looking for [How do you split a list into evenly sized chunks?](https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks) – Moinuddin Quadri Dec 28 '20 at 21:11
  • 1
    Can you show what the expected output would look like for the specific input? – adir abargil Dec 28 '20 at 21:11
  • Does this answer your question? [How do you split a list into evenly sized chunks?](https://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks) – Jack Moody Dec 28 '20 at 21:12
  • @MoinuddinQuadri This isn't a normal list; it's an awfully deeply nested one. – adir abargil Dec 28 '20 at 21:12
  • @adirabargil Each nested sublist of 20 elements needs to be treated as a single element of the outer list, which then needs to be divided into 600 chunks (my understanding of the OP's question). So the link I shared above is still applicable to what the OP is trying to achieve. – Moinuddin Quadri Dec 28 '20 at 21:14
  • Please provide the expected output. You can [edit] the question. Also, are you sure it's a list? The data you posted here is nested tuples. BTW, welcome to SO! Check out the [tour] if you haven't already, and [ask] if you want tips. – wjandrea Dec 28 '20 at 21:20
  • @roybatty, I think your question is difficult to understand. Besides the fact that you are using parentheses and not brackets, which would form a tuple rather than a list in Python, your number of parentheses is not balanced. You open three parentheses, but you close four. Also, your tuple seems to start as a recursive data structure. This is fine, but you need to specify its exact format. Your example starts with three parentheses. Does that imply that your actual data would start with six hundred parentheses? – nilo Dec 28 '20 at 21:44
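
The chunking approach from the linked question can be sketched as follows. This is only a minimal illustration and assumes the data is already a flat sequence, which is not the case for the nested tuple in the question; the names `chunks` and `flat` are made up for the example.

```python
def chunks(seq, size):
    """Yield successive slices of `seq`, each with at most `size` items."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


# Hypothetical flat stand-in for the OP's data: 12000 scalars
flat = list(range(12000))
groups = list(chunks(flat, 20))
print(len(groups))  # 600
```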

3 Answers

3

We have a recursively nested iterable, and it could be huge.

In [1]: (((1, 2, 3, 4), 5, 6, 7, 8), 9, 10, 11, 12)
Out[1]: (((1, 2, 3, 4), 5, 6, 7, 8), 9, 10, 11, 12)

Since the data structure is recursive, a recursive function would solve this elegantly, but would also consume an arbitrary amount of stack, which is rude.
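
For comparison, a recursive variant might look like the sketch below. The name `unpack_recursively` is made up for this illustration. It uses one stack frame per nesting level, so input nested deeper than the interpreter's recursion limit would raise `RecursionError`.

```python
def unpack_recursively(data):
    """Hypothetical recursive variant, shown only for comparison."""
    if not data:
        # Empty input: a single empty group
        return [list(data)]
    head, *tail = data
    try:
        # Can we go deeper?
        len(head)
    except TypeError:
        # head is a scalar, so data is the innermost group
        return [list(data)]
    # Inner groups first, then the scalars of the current level
    return unpack_recursively(head) + [tail]


print(unpack_recursively((((1, 2, 3, 4), 5, 6, 7, 8), 9, 10, 11, 12)))
# [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
```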

Such a function would however be a classic example of a tail-recursive function, i.e. one whose tail call is easy to eliminate manually. In fact, if we let the Python generator mechanism take care of the intermediate results for us, the resulting function is almost as elegant as the recursive one, while requiring very little RAM (stack or otherwise). That generator function could also be used for purposes other than creating the target list, and we do not have to restrict it to tuples.

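def unpack_recursive_iterable(data):
    """Yield the groups of a recursively nested iterable, outermost first."""
    while True:
        # An empty iterable has no head; None makes len() fail below.
        head, *tail = data if data else (None,)
        try:
            # Can we go deeper?
            len(head)
        except TypeError:
            # Deepest level
            yield list(data)
            return
        yield tail
        data = head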

Now, assuming we want a list of lists, in the order of the original iterable, we can create an additional adapter function:

def list_from_recursive_iterable(data):
    unpacked = list(unpack_recursive_iterable(data))
    unpacked.reverse()
    return unpacked

Validation tests (the solution works for tuples, lists, sets and dicts, and sub-parts may be empty):

In [4]: list_from_recursive_iterable((((1, 2, 3, 4), 5, 6, 7, 8), 9, 10, 11, 12))
Out[4]: [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]

In [5]: list_from_recursive_iterable(())
Out[5]: [[]]

In [6]: list_from_recursive_iterable(((((1,), 2), 3), 4))
Out[6]: [[1], [2], [3], [4]]

In [7]: list_from_recursive_iterable((((),),))
Out[7]: [[], [], []]

In [8]: list_from_recursive_iterable(((1,),))
Out[8]: [[1], []]

In [9]: list_from_recursive_iterable({1})
Out[9]: [[1]]

In [10]: list_from_recursive_iterable({1:2})
Out[10]: [[1]]

In [11]: list_from_recursive_iterable({1,2})
Out[11]: [[1, 2]]

In [12]: list_from_recursive_iterable([[1],2])
Out[12]: [[1], [2]]

It should be noted that this solution does in fact only fulfill the OP's requirement of "evenly distributed chunks" if the input itself is "evenly distributed", i.e. if every group of scalars found in the input data is equally-sized. But that requirement is fulfilled in the OP's input data.
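
To try this on input of the size described in the question, one could build a synthetic tuple nested 600 levels deep and check that every resulting group has 20 elements. The sketch below repeats the two functions from this answer so that it runs on its own. The test data is made up: each level gets 20 consecutive integers, which is an assumption and not the OP's real data.

```python
def unpack_recursive_iterable(data):
    while True:
        head, *tail = data if data else (None,)
        try:
            len(head)
        except TypeError:
            yield list(data)
            return
        yield tail
        data = head


def list_from_recursive_iterable(data):
    unpacked = list(unpack_recursive_iterable(data))
    unpacked.reverse()
    return unpacked


# Synthetic input (assumption): innermost group is 0..19,
# and each outer level wraps the previous tuple and adds 20 more scalars.
nested = tuple(range(20))
for level in range(1, 600):
    nested = (nested, *range(level * 20, level * 20 + 20))

result = list_from_recursive_iterable(nested)
print(len(result))                         # 600
print(all(len(g) == 20 for g in result))   # True
```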

wjandrea
nilo
1

Maybe I'm reading the question wrong, but you could probably do some string manipulation:

a = ((((76, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248,
       251, 261, 315, 329), 76, 151, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217,
    232, 233, 242, 244, 248, 251, 261), 76, 151, 152, 158, 185, 193, 193, 200, 208, 211,
    214, 217, 221, 232, 233, 242, 244, 248, 251), 76, 151, 152, 158, 185, 193, 193, 200,
    208, 211, 214, 217, 221, 229, 232, 233, 242, 244, 248)


# Convert to string
astr = f"{a}"

# Output list
lines = []

# Iterate over split lines
for line in astr.split("),"):

    # Check to see if left parenthesis starts a line
    if line.startswith("("):
        # Subindex the line by the number of left parenthesis
        line = line[line.count("("):]
    # Remove trailing parenthesis
    if line.endswith(")"):
        line = line[:-1]
    # Append the line to the lines list, stripping any white space away.
    lines.append(line.strip())

Output:

['76, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248, 251, 261, 315, 329',
 '76, 151, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248, 251, 261',
 '76, 151, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 221, 232, 233, 242, 244, 248, 251',
 '76, 151, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 221, 229, 232, 233, 242, 244, 248']
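
Note that the entries above are strings, not lists of numbers. If lists of integers are wanted, each string can be split on commas and converted. This is a minimal sketch with shortened sample strings; `groups` is a name made up for the example.

```python
# Shortened stand-in for the `lines` list produced above
lines = ['76, 152, 158, 185', '76, 151, 152, 158']

# int() tolerates the surrounding spaces left by splitting on ","
groups = [[int(token) for token in line.split(",") if token.strip()]
          for line in lines]

print(groups)  # [[76, 152, 158, 185], [76, 151, 152, 158]]
```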
Mark Moretto
0

You can iteratively 'unpack' the tuple into two parts, the second part being the 20 (or N) elements of the subset.

data = (((76, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248, 251, 261, 315, 329), 76, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248, 251, 261, 315, 329), 76, 152, 158, 185, 193, 193, 200, 208, 211, 214, 217, 232, 233, 242, 244, 248, 251, 261, 315, 329)

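result = []
if data and not isinstance(data[0], tuple):
    # Flat input: a single group, nothing to unpack
    result.append(list(data))
else:
    # Peel off the outermost group; part_1 is the nested remainder
    part_1, *part_2 = data
    result.append(part_2)
    while part_1 and isinstance(part_1[0], tuple):
        part_1, *part_2 = part_1
        result.append(part_2)
    # Innermost group; result is ordered outermost group first
    result.append(list(part_1))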

print(result)
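
The loop above collects the groups outermost first. If the innermost group should come first, the same idea can be wrapped in a function that reverses the result at the end. This is only a sketch: `split_nested` is a hypothetical name, and like the code above it assumes the nesting uses tuples.

```python
def split_nested(data):
    """Hypothetical helper: peel one nesting level per iteration."""
    result = []
    # While the first element is itself a tuple, there is another level
    while data and isinstance(data[0], tuple):
        data, *rest = data
        result.append(rest)
    # What remains is the innermost group
    result.append(list(data))
    # Groups were collected outermost first; put the innermost first
    result.reverse()
    return result


print(split_nested((((1, 2), 3, 4), 5, 6)))  # [[1, 2], [3, 4], [5, 6]]
```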
gordon so
  • Nice post. I think I see a trailing 76 if I run it on my machine. Is that the same for you? – Mark Moretto Dec 28 '20 at 21:29
  • 1
    Good catch; shame on me for not doing TDD. I updated the solution with if statements to catch the initial case and the last parsed iteration. – gordon so Dec 28 '20 at 21:46