If you have a reference threshold value, you can just naively filter the data with NumPy using something like `data[data < threshold]`, with `threshold` set for example to `10_000`. Alternatively, if it does not always make sense to simply remove the outliers, you can keep the array shape and replace them with NaN using `data[data >= threshold] = np.nan` (note the flipped comparison, so that only the outliers are overwritten).
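For instance, here is a minimal sketch of both variants (the sample values are made up for illustration):

```python
import numpy as np

data = np.array([5.0, 12.0, 25_000.0, 8.0, 11_000.0, 3.0])
threshold = 10_000  # reference value above which points are outliers

# Variant 1: drop the outliers entirely (the result is shorter)
filtered = data[data < threshold]        # [5. 12. 8. 3.]

# Variant 2: keep the array shape and mark the outliers as NaN
marked = data.copy()
marked[marked >= threshold] = np.nan     # [5. 12. nan 8. nan 3.]
```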
If you do not have a reference value, then things start to get a bit more complex. There are fancy ways to detect such patterns efficiently, but most of them are quite involved.
The simplest solution is to analyse the standard deviation of your input data over a sliding window and flag outliers based on the resulting local standard deviation. You can see how to do that here (you need to combine this with something like `data[sdValues < threshold]` to remove the outliers). Note however that this method is very sensitive to values near 0.
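Below is a minimal sketch of this idea; the window size and threshold are hypothetical values that need tuning for your data. It computes the local standard deviation from the identity Var[x] = E[x²] − E[x]², using two sliding-window means:

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def rolling_std(data, window):
    # Local standard deviation via Var[x] = E[x^2] - E[x]^2,
    # computed with two sliding-window (uniform) means.
    mean = uniform_filter1d(data, size=window)
    mean_sq = uniform_filter1d(data * data, size=window)
    # Clamp tiny negative values caused by floating-point rounding
    return np.sqrt(np.maximum(mean_sq - mean * mean, 0.0))

rng = np.random.default_rng(0)
data = rng.normal(100.0, 1.0, 500)  # smooth signal...
data[250] = 25_000.0                # ...with one injected spike

sdValues = rolling_std(data, window=11)  # window size: hypothetical
threshold = 5.0                          # local-std threshold: hypothetical
cleaned = data[sdValues < threshold]
```

Note that points near the spike are removed too, since the spike inflates the standard deviation of every window that contains it.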
An alternative solution is to compute a Gaussian or median filter and then measure the relative difference (or a more advanced distance metric) between the filtered signal and your input data (a bit like a high-pass filter). You can take a look at this post to do that.
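Here is a sketch of that approach, using a median filter from `scipy.signal` as the smooth reference; the kernel size and threshold are again hypothetical:

```python
import numpy as np
from scipy.signal import medfilt

rng = np.random.default_rng(0)
data = rng.normal(100.0, 1.0, 500)  # smooth signal...
data[250] = 25_000.0                # ...with one injected spike

# Median filter: a smooth reference that is robust to isolated spikes
baseline = medfilt(data, kernel_size=11)

# Relative difference between the raw signal and its smoothed version
# (the epsilon guards against division by a near-zero baseline)
rel_diff = np.abs(data - baseline) / np.maximum(np.abs(baseline), 1e-12)

threshold = 0.5                     # hypothetical: >50% deviation from baseline
cleaned = data[rel_diff < threshold]
```

A median filter is used here rather than a Gaussian one because it barely moves in the presence of isolated spikes, which makes the relative difference at the spike much larger.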
For these two methods, you still need to define an arbitrary threshold, but unlike the naive method, this threshold relates to the data variation rather than the raw values themselves. It is up to you to find a good threshold given the data variations, the outliers and the expected final output.
Note: you might be interested in using `scipy.signal` (especially to compute filters).