I have an ordered vector, such as:
c(2, 2.8, 2.9, 3.3, 3.5, 4.7, 5.5, 7.2, 7.3, 8.7, 8.7, 10)
I want to not only remove duplicates (which is easy with unique()), but also to average values which are too close to each other, based on a closeness threshold.
So for the above example, if the difference between two values is, say, <= 0.4, average them. The vector should become:
c(2, 2.85, 3.4, 4.7, 5.5, 7.25, 8.7, 10)
The check should be performed on pairs of numbers, and repeated until there is no more averaging to do.
EDIT: note that 2.9 and 3.3 should not be averaged, because 2.9 is already being averaged with 2.8, and once this has been done, the resulting 2.85 is 0.45 away from 3.3, which is greater than 0.4. So the cluster 2.8, 2.9, 3.3, 3.5 ends up being 2.85, 3.4 and not 3.125.
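A minimal sketch of one reading of this rule, assuming pairs are formed greedily from left to right and the pass is repeated until nothing merges. The names merge_close and tol are hypothetical, chosen for illustration, and not from any package:

```r
# Hypothetical helper: merge_close() and tol are illustrative names.
# Assumption: pairs are taken greedily from left to right, so a value that
# has just been averaged with its left neighbour is not reused in that pass.
merge_close <- function(x, tol = 0.4) {
  x <- sort(x)
  repeat {
    out <- numeric(0)
    merged <- FALSE
    i <- 1
    n <- length(x)
    while (i <= n) {
      if (i < n && x[i + 1] - x[i] <= tol) {
        # Close pair: replace both values with their mean and skip past them.
        out <- c(out, mean(x[i:(i + 1)]))
        merged <- TRUE
        i <- i + 2
      } else {
        # No close right-hand neighbour: keep the value unchanged.
        out <- c(out, x[i])
        i <- i + 1
      }
    }
    x <- out
    # Stop once a full pass produced no averaging.
    if (!merged) break
  }
  x
}

v <- c(2, 2.8, 2.9, 3.3, 3.5, 4.7, 5.5, 7.2, 7.3, 8.7, 8.7, 10)
merge_close(v, tol = 0.4)
```

On the example vector this should reproduce the expected result above, up to floating-point rounding. Exact duplicates are handled by the same rule, since their difference is 0. Differences that land exactly on the threshold may go either way because of floating-point representation, so a small tolerance may be needed in practice.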
Is there any simple way of doing this?