This might be a simple problem, but I haven't come up with a solution. Say I have an array np.array([0,1,0,1,0,0,0,1,0,1,0,0,1]) with peaks at indices [1,3,7,9,12]. How can I replace those indices with [2,8,12], that is, average the indices of peaks that are close together, where peaks count as separate only if the distance between them is greater than a threshold (2 in this example)?

Please note that the binary values in the array are just for illustration; a peak value can be any real number.

Francis
  • Use a histogram, perhaps? – Chris Arena Mar 08 '15 at 19:38
  • Suppose you have peaks at [1, 3, 5]. Do you want [3] (average of the three peaks)? Or [2,5]? Or [1,4]? – matiasg Mar 08 '15 at 19:56
  • Sorry for missing that; I'd like it to be 3. There probably won't be many peaks close together, and for simplicity the middle one should be picked for now, without considering the relative heights of the peaks. – Francis Mar 09 '15 at 06:15

1 Answer

You could use Raymond Hettinger's cluster function:

from __future__ import division

def cluster(data, maxgap):
    """Arrange data into groups where successive elements
       differ by no more than *maxgap*

        >>> cluster([1, 6, 9, 100, 102, 105, 109, 134, 139], maxgap=10)
        [[1, 6, 9], [100, 102, 105, 109], [134, 139]]

        >>> cluster([1, 6, 9, 99, 100, 102, 105, 134, 139, 141], maxgap=10)
        [[1, 6, 9], [99, 100, 102, 105], [134, 139, 141]]
    """
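    data = sorted(data)  # sort a copy so the caller's list is left untouched
    if not data:
        return []  # nothing to group
    groups = [[data[0]]]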
    for item in data[1:]:
        val = abs(item - groups[-1][-1])
        if val <= maxgap:
            groups[-1].append(item)
        else:
            groups.append([item])
    return groups

peaks = [1,3,7,9,12]
print([sum(arr)/len(arr) for arr in cluster(peaks, maxgap=2)])

yields

[2.0, 8.0, 12.0]
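
If the peak indices are already in a NumPy array, the same grouping can be sketched without an explicit Python loop. This is only an illustration under stated assumptions: the function name merge_close_peaks is hypothetical, and the peaks in the example are found with np.flatnonzero, which works for the binary array in the question but is not a general peak detector.

```python
import numpy as np

def merge_close_peaks(peaks, maxgap):
    """Average runs of peak indices whose successive gaps are <= maxgap.

    Hypothetical helper: same grouping rule as cluster() above,
    expressed with np.diff and np.split.
    """
    peaks = np.sort(np.asarray(peaks, dtype=float))
    if peaks.size == 0:
        return np.array([])
    # A gap larger than maxgap marks the first element of a new group.
    breaks = np.flatnonzero(np.diff(peaks) > maxgap) + 1
    groups = np.split(peaks, breaks)
    return np.array([group.mean() for group in groups])

# Assumption: in the binary example every nonzero sample is a peak.
signal = np.array([0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1])
peaks = np.flatnonzero(signal)          # indices 1, 3, 7, 9, 12
print(merge_close_peaks(peaks, maxgap=2))  # mean indices 2.0, 8.0, 12.0
```

For peaks at [1, 3, 5] with maxgap=2 this gives a single value, 3.0, which matches the behaviour asked for in the comments. For real-valued signals the peak indices would have to come from an actual peak-finding step instead of np.flatnonzero.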
unutbu