
I'd like to recognize sequences in this array, but all of my ideas are very inefficient:

np.array([ 27,  28,  29,  30,  31,  38,  39,  40,  43,  44,  45,  46,  57,
             58,  59,  74,  85,  87,  88,  89,  90,  95,  96,  97, 166, 182,
            183, 265, 269, 271, 272, 279, 280, 281, 282, 288, 326, 327, 328,
            356, 387, 399, 407, 408, 437, 438, 439, 453, 454, 455, 456, 457,
            480, 489, 537, 538, 673, 674, 676, 677, 682, 687, 704, 729, 730,
            732, 733, 745, 746, 747, 748],
           dtype='int64')

I expect to get array [27,28,29,30,31] as group 'A' or group '1'; 31 as group 'B' or 2; [38,39,40] as group 3 or 'C', etc.

Do you know of any library that does this, or any reasonably efficient way to do it?

Tomas

1 Answer

#!/usr/bin/env python3

array = [ 
      27,  28,  29,  30,  31,  38,  39,  40,  43,  44,  45,  46,  57,
      58,  59,  74,  85,  87,  88,  89,  90,  95,  96,  97, 166, 182,
      183, 265, 269, 271, 272, 279, 280, 281, 282, 288, 326, 327, 328,
      356, 387, 399, 407, 408, 437, 438, 439, 453, 454, 455, 456, 457,
      480, 489, 537, 538, 673, 674, 676, 677, 682, 687, 704, 729, 730,
      732, 733, 745, 746, 747, 748
]


def groupByDistance(distance, array):
    """Group consecutive elements that differ by exactly `distance`."""
    index_key = 65  # ord('A'); keys run 'A'..'Z', then 'a'..'z'
    ret = {}
    found = []

    # Iterate without appending a sentinel, so the caller's list is not modified
    for index, value in enumerate(array):
        found.append(value)
        is_last = index == len(array) - 1
        # Close the current group at the end or when the next value breaks the run
        if is_last or value + distance != array[index + 1]:
            ret[chr(index_key)] = found
            found = []
            index_key += 1
            if 91 <= index_key <= 96:
                index_key = 97  # Skip the punctuation between 'Z' and 'a'

    return ret

res = groupByDistance(1, array)

for key in res:
    print(key, " : ", res[key])

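This is just a hint. Note that for a large array the ASCII letters are not sufficient (there are only 52 of them, after which the keys run into other characters); use something else for the keys instead, such as the characters in string.punctuation. Also, before Python 3.7 the returned dict is not guaranteed to keep its insertion order, but you can easily supply a function to reorder it.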
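Since the question starts from a NumPy array, here is a minimal sketch of the same grouping done with np.diff and np.split. It swaps the letter keys for integer group numbers starting at 1, so the number of keys is not limited. The function name group_runs and the shortened sample array are my own, purely for illustration, and the sketch assumes the input is a sorted 1-D integer array.

```python
import numpy as np

def group_runs(values, distance=1):
    """Split a 1-D array wherever neighbours do not differ by `distance`."""
    values = np.asarray(values)
    # Indices at which a new run starts: one past each "broken" difference
    breaks = np.where(np.diff(values) != distance)[0] + 1
    runs = np.split(values, breaks)
    # Number the groups from 1 instead of using letters
    return {i: run.tolist() for i, run in enumerate(runs, start=1)}

# Shortened sample taken from the start of the array in the question
sample = np.array([27, 28, 29, 30, 31, 38, 39, 40, 43, 44, 45, 46, 57])
groups = group_runs(sample)

for key, run in groups.items():
    print(key, ":", run)
```

This avoids the Python-level loop, which may matter for large arrays, but I have not benchmarked it against the loop version.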