0

If have lists of natural numbers which I would like to group. I am looking for an algorithm in JavaScript but I am happy with any suggestions.

The examples series are just examples, what the dataset looks like. The resulting groups don't have to be exactly like in the examples, just to give a General idea what I try to achieve.

I have tested hierachical clustering but it does not lead to the desired grouping.

| separates groups:

  • 3 4 5 | 10 12 13 14 | 100 200 300
  • 100 110 111 112 118 | 200 300 400 | 1000 1001 1004 | 1020 1030 1032 1040
  • 1 3 4 7 8 9 | 10 15 20 30 50 60 80 | 100 200 300
  • 1 2 3 | 10 11 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 28 29 30 31 40 41 50 | 100 101 102 103 104 105 106 107 108 | 200 201 202 203 204 205 206 209 210 211 212 | 900 901 | 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 | 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 | 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125|3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223

Edit: second group 3001 changed to 300

Raros
  • 21
  • 1
  • 1
  • 7
  • These groups are your input or output? – Konrad Aug 09 '22 at 20:58
  • Sorry, Input are only the numbers without groups. I would like to find the groups/borders, similar to the | from the examples. – Raros Aug 09 '22 at 21:07
  • 1
    Is this a codewars task? – Konrad Aug 09 '22 at 21:11
  • @Raros And is different OK as long as there is reasoning for why it is different? – btilly Aug 09 '22 at 22:26
  • Why are 3 and 10 in the same group in the first bullet, but separate groups in the 3rd and 4th bullets? What makes a grouping 'good'? In your examples, why are the separators where they are? – Dave Aug 09 '22 at 22:29
  • @Dave In the third bullet you could argue to have the 10 in the first group aswell. – Raros Aug 10 '22 at 08:48

2 Answers2

0

You are placing a separator when either of two conditions is met:

  1. the number of digits increases

  2. the gap between adjacent numbers increases significantly relative to what was established early in the group.

    3 4 5 | 10 12 13 14 | 100 200 300

Here your first separator has both a big increase in gap and a new digit, as does the second.

100 110 111 112 118 | 200 300 400 | 1000 1001 1004 | 1020 1030 1032 1040

Here your first separator has a big increase in gap, the second has both a big increase in gap and a new digit, and the third has a big increase in gap.

1 3 4 7 8 9 | 10 15 20 30 50 60 80 | 100 200 3001

Here my theory fails; I'd expect a separator between 200 and 3001. Maybe there's a third rule about the minimum size of a group. If that's the case, then if we added a 3002 at the end of this and the min size is 2 then we'd have a gap between 200 and 3001.

1 2 3 | 10 11 12 13 14 15 16 17 18 19 21 22 23 24 25 26 27 28 29 30 31 40 41 50 | 100 101 102 103 104 105 106 107 108 | 200 201 202 203 204 205 206 209 210 211 212 | 900 901 | 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 | 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 | 3101 3102 3103 3104 3105 3106 3107 3108 3109 3110 3111 3112 3113 3114 3115 3116 3117 3118 3119 3120 3121 3122 3123 3124 3125|3201 3202 3203 3204 3205 3206 3207 3208 3209 3210 3211 3212 3213 3214 3215 3216 3217 3218 3219 3220 3221 3222 3223

This seems to fit the pattern.

Dave
  • 7,460
  • 3
  • 26
  • 39
-1
import itertools

nums = [1, 6, 9, 100, 102, 105, 109, 134, 139]
for k, g in itertools.groupby(nums, key=lambda n: n//10):
    print(k, list(g))


#Result:
0 [1, 6, 9]
10 [100, 102, 105, 109]
13 [134, 139]

See this for more detail.

Grouping / clustering numbers in Python

ASH
  • 20,759
  • 19
  • 87
  • 200