How can I make my output group all similar numbers into a specific number of groups?

Question

I wrote this code, which is meant to group the numbers in a list into a total of n groups (n: int):

Edit: if the purpose of the code is unclear, please look in the comments, where I've explained it. Thank you :)

def calcdifference(lst: list):
    for x in lst:
        return (x -= x)
print(calcdifference(lst=[4,5,6,4,3,2,3,4,5]))

def grouping(lst: list, n: int):
    if calcdifference(x) in list == max(calcdifference(x)):
        lst.append(x)
print(grouping(lst=[4,5,6,4,3,2,3,4,5]))

n: int is the number of groups a single list is split into. If n is 3, the numbers are grouped as (x,...), (x,...), (x,...); if n = 2, they are grouped as (x,...), (x,...).

However, my code prints all possible combinations in a list of n elements; it doesn't group the numbers together. What I want is this: for instance, if the input is

[10,12,45,47,91,98,99]

and if n = 2, the output would be

[10,12,45,47] [91,98,99]

and if n = 3, the output would be

[10,12] [45,47] [91,98,99]

What changes to my code should I make?

Note: please refrain from using built-in functions or imports, since I want to use as few built-in functions as possible

Important: the code should be able to print n >= len(lst) combinations for every list provided

I think I saw this question a few hours ago, and I guess the issue with `n` is kind of addressed here. But still, what is the logic behind it? Why is, for example, `[10,12,45] [47,91,98,99]` not expected when `n=2`? What is expected when `n` is greater than `len(lst)`? — j1-lee, Nov 08 '21 at 05:13
@j1-lee yep I tried to get the answer using a different approach but that failed as well, so I asked this again lol. n can not be bigger than the len(list) because the maximum number of groupings within the list is equal to len(list). Like for instance if there are 5 elements in a list, the maximum number of groupings is 5. — Bruffff, Nov 08 '21 at 05:34
@j1-lee so the logic is that for the list that is inputted, we must separate the similar numbers in the list and group them together. I tried using calcdifference to find the differences between each number in the list. And where the difference is the biggest, I plan to separate the list into n groups. If it says n = 1, then I would look for the combination with the biggest difference and separate it there. If n = 2, I would look for the 2 biggest differences in the list and separate them there. If n = 3, then I would look for the top 3 biggest differences and separate it — Bruffff, Nov 08 '21 at 05:39
Does this answer your question? [1D Number Array Clustering](https://stackoverflow.com/questions/11513484/1d-number-array-clustering) — Mateen Ulhaq, Nov 08 '21 at 06:25

j1-lee · Accepted Answer · 2021-11-08T21:39:48.410

You can try the following:

def grouping(lst, n):
    diff = enumerate((abs(x - y) for x, y in zip(lst, lst[1:])), start=1)
    cut = sorted(x[0] for x in sorted(diff, reverse=True, key=lambda x: x[1])[:n-1])
    cut = [0, *cut, len(lst)] # add 0 and last index
    return [lst[i:j] for i, j in zip(cut, cut[1:])] # return slices

lst = [10,12,45,47,91,98,99]
print(grouping(lst, 2))
print(grouping(lst, 3))
print(grouping(lst, 4))

Output:

[[10, 12, 45, 47], [91, 98, 99]]
[[10, 12], [45, 47], [91, 98, 99]]
[[10, 12], [45, 47], [91], [98, 99]]

Admittedly it is quite complicated and perhaps less pythonic. There might be a more efficient way. Anyway, some explanation follows.

In the first line, diff is an enumerate object (a lazy iterator, so only sort of a list) yielding tuples (i, d), where d is the absolute difference between the (i-1)th item and the ith item of lst.
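To make this concrete, here is a minimal sketch that materializes diff as a list for the sample input, so the (i, d) tuples can be inspected. Wrapping it in list(...) is only for display and is not part of the answer's function.

```python
lst = [10, 12, 45, 47, 91, 98, 99]

# Pair each item with its successor, take the absolute difference,
# and number the pairs starting at 1 (the index of the right-hand item).
diff = list(enumerate((abs(x - y) for x, y in zip(lst, lst[1:])), start=1))

print(diff)
# [(1, 2), (2, 33), (3, 2), (4, 44), (5, 7), (6, 1)]
```

For example, (4, 44) says that the jump from lst[3] = 47 to lst[4] = 91 is 44, so a cut at index 4 would start a new group at 91.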

The second line is more complicated. First, sorted(diff, reverse=True, key=lambda x: x[1]) sorts those tuples according to the second element, i.e., the tuple representing highest jump comes first.

Then sorted(...)[:n-1] picks the first n-1 tuples. Those will be the n-1 cuts to be used.

The generator expression (x[0] for x in ...) just picks the first item of each tuple, i.e., the cut position; we no longer need the differences.

And then the outer sorted(...) sorts those cut positions in ascending order, which is what makes the slicing in the subsequent lines work.
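The steps described above can be traced one at a time. The following sketch unrolls the second line of grouping for n = 3 on the sample list; the intermediate names by_gap and top are introduced here for illustration only and do not appear in the original function.

```python
lst = [10, 12, 45, 47, 91, 98, 99]
n = 3

diff = enumerate((abs(x - y) for x, y in zip(lst, lst[1:])), start=1)

# Step 1: largest difference first (ties keep their original order).
by_gap = sorted(diff, reverse=True, key=lambda x: x[1])

# Step 2: keep the n-1 largest jumps; n groups need n-1 cuts.
top = by_gap[:n - 1]
print(top)   # [(4, 44), (2, 33)]

# Step 3: drop the differences and put the cut positions in ascending order.
cut = sorted(x[0] for x in top)
print(cut)   # [2, 4]

# Step 4: add the two ends and slice between consecutive positions.
cut = [0, *cut, len(lst)]
groups = [lst[i:j] for i, j in zip(cut, cut[1:])]
print(groups)  # [[10, 12], [45, 47], [91, 98, 99]]
```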

If you are reluctant to use lambda for some reason (operator.itemgetter(1) is arguably a cleaner choice than lambda x: x[1], but it needs an import), you can just write a small custom function for that.

def get_1st(x):
    # key function: return the item at index 1, i.e. the difference d in (i, d)
    return x[1]

def grouping(lst, n):
    diff = enumerate((abs(x - y) for x, y in zip(lst, lst[1:])), start=1)
    cut = sorted(x[0] for x in sorted(diff, reverse=True, key=get_1st)[:n-1])
    cut = [0, *cut, len(lst)] # add 0 and last index
    return [lst[i:j] for i, j in zip(cut, cut[1:])] # return slices
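Since the question asks for as few built-in functions as possible, the same idea (cut at the n-1 largest jumps) can also be written with plain loops. This is only a sketch under that assumption, not part of the original answer: the name grouping_loops is hypothetical, and it still relies on range, len and list.append.

```python
def grouping_loops(lst, n):
    # gaps[i] is the absolute difference between lst[i] and lst[i + 1]
    gaps = []
    for i in range(len(lst) - 1):
        d = lst[i + 1] - lst[i]
        if d < 0:
            d = -d
        gaps.append(d)

    # is_cut[i] is True when a group should end right after lst[i]
    is_cut = [False] * len(gaps)
    for _ in range(n - 1):
        best = -1
        for i in range(len(gaps)):
            # strict ">" keeps the earliest gap when two gaps are equal
            if not is_cut[i] and (best == -1 or gaps[i] > gaps[best]):
                best = i
        if best == -1:  # every gap is already a cut; nothing left to split
            break
        is_cut[best] = True

    # walk the list once, closing the current group at every cut
    groups = []
    current = []
    for i in range(len(lst)):
        current.append(lst[i])
        if i == len(lst) - 1 or is_cut[i]:
            groups.append(current)
            current = []
    return groups


lst = [10, 12, 45, 47, 91, 98, 99]
print(grouping_loops(lst, 2))  # [[10, 12, 45, 47], [91, 98, 99]]
print(grouping_loops(lst, 3))  # [[10, 12], [45, 47], [91, 98, 99]]
```

Picking the largest remaining gap n-1 times takes the place of the sort, and marking cuts in is_cut removes the need to sort the cut positions afterwards, because the final loop visits them in list order anyway.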

Thank u for ur response. However, is it possible not to use key=lambda.. like everything else is great though. I understand everything else but key=lambda and it would really help for me to u know, write a code that I myself can understand. — Bruffff, Nov 08 '21 at 07:01
