
I'm analyzing a signal sampled at 200 Hz for 6-8 seconds, and the important parts are the spikes, which last at most 1 second. Think, for example, of an earthquake...

I have to downsample the signal by a factor of 2. I tried:

from scipy import signal

signal.decimate(mysignal, 2, ftype="fir")
signal.resample_poly(mysignal, 1, 2)

I get the same result with both functions: the signal is resampled, but the spikes, both positive and negative, are diminished.

Did I pick the wrong function, or do I have to pass a custom FIR filter?

Marco Sulla
  • how many samples are there per spike? – DrBwts Nov 29 '19 at 11:16
  • @DrBwts: the signal is sampled at 200 Hz, and the spikes usually last less than 1 second. The signal lasts 8 seconds at most. – Marco Sulla Nov 29 '19 at 19:51
  • Are you sampling live or just analyzing data afterwards? – Anteino Dec 09 '19 at 00:39
  • What if you use [scipy.signal.resample](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.resample.html#scipy.signal.resample), give x, num, t and axis and then play with the window argument? – Patol75 Dec 09 '19 at 00:45
  • @Anteino: analyzing – Marco Sulla Dec 09 '19 at 01:40
  • @Patol75: as far as I know, that function is meant for periodic signals, and I do not have a periodic signal. – Marco Sulla Dec 09 '19 at 01:41
  • @MarcoSulla Well, any signal can be made periodic if you repeat it. Additionally, the window argument sounds quite important to me as, depending on the kind of function you use, you can get significantly different results. See [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.get_window.html). – Patol75 Dec 09 '19 at 02:06
  • @Patol75: I'm confused. What do you mean with <>? I have to mirror the signal and downsample it? Furthermore, I do not know what kind of window can be more useful to my case. – Marco Sulla Dec 09 '19 at 20:22
  • @MarcoSulla Well, that could be an idea; I cannot guarantee it works, but it is worth trying. I do not know either, but I am pretty sure there is literature in your field that can guide you toward which window to use. – Patol75 Dec 09 '19 at 21:37

4 Answers


Note

Downsampling will always damage the signal when its frequency content pushes against the sampling-frequency limit (Nyquist-Shannon sampling theorem). Your spikes behave like very high-frequency components, so they require a very high sampling frequency.

(Example: you have 3 points and the middle one carries a spike. You want to downsample to 2 points. Where do you put the spike? Nowhere: you have run out of samples.)
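That attenuation can be reproduced in a couple of lines. The snippet below is only an illustration: it uses a two-sample average as a crude stand-in for the FIR anti-alias filter that decimate applies, not scipy's actual filter.

```python
import numpy as np

# A flat signal with a single one-sample spike.
x = np.zeros(200)
x[100] = 1.0

# Crude stand-in for the anti-alias step: average each pair of
# samples (a 2-tap FIR low-pass), then keep one value per pair.
y = x.reshape(-1, 2).mean(axis=1)

print(x.max(), y.max())  # the spike's amplitude is halved: 1.0 -> 0.5
```

The spike's energy is smeared into its neighbour, which is exactly the "diminished spikes" effect the question describes.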

Nevertheless, if you really want to downsample the signal and still preserve (more or less accurately) particular points (in your case the spikes), you can try the approach below, which 'saves' your spikes, downsamples the signal, and only afterwards re-applies the 'saved' spikes at the corresponding downsampled positions.

Steps to do:

1) Get the spikes, in other words the local maxima (or minima).

example: Pandas finding local max and min

2) Downsample the signal

3) Replace the corresponding downsampled values with the spikes you got in 1)

(Accept the fact that your signal will be damaged: you can't downsample without losing spikes that are represented by only one or two points.)
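The three steps above could be sketched like this. For brevity a simple magnitude threshold stands in for the local-extrema search linked in 1), and the threshold value is made up:

```python
import numpy as np

def downsample_keep_spikes(sig, factor, threshold):
    # 1) 'save' the spikes: here simply samples whose magnitude
    #    exceeds a threshold (a stand-in for a local-extrema search)
    spike_idx = np.flatnonzero(np.abs(sig) > threshold)
    # 2) downsample by taking every `factor`-th sample
    out = sig[::factor].copy()
    # 3) write each saved spike back at its nearest downsampled position
    for i in spike_idx:
        j = min(i // factor, len(out) - 1)
        if abs(sig[i]) > abs(out[j]):
            out[j] = sig[i]
    return out

sig = np.zeros(20)
sig[7] = 5.0
sig[13] = -4.0
out = downsample_keep_spikes(sig, factor=2, threshold=1.0)
print(out[3], out[6])  # 5.0 -4.0: both spikes survive the factor-2 step
```

Note that the spike positions are quantized to the new sample grid, which is part of the damage discussed above.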

EDIT

Illustrative example

This is an example of how to keep the spikes. It is only an example; as written it does not work for negative spikes.


import numpy as np
import matplotlib.pyplot as plt
from collections import deque


t = np.arange(1000)/100
y = np.sin(t*2*3.14)
y[150]=5
y[655]=5
y[333]=5
y[250]=5

def downsample(factor, values):
    buffer_ = deque([], maxlen=factor)
    downsampled_values = []
    for i, value in enumerate(values):
        buffer_.appendleft(value)
        if (i - 1) % factor == 0:
            # Take the max value out of the buffer
            # (or average, and keep the max only when the
            # difference between them is too big).
            downsampled_values.append(max(buffer_))
    return np.array(downsampled_values)

plt.plot(downsample(10, y))
plt.show()
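Since the max-based buffer above drops negative spikes, a variant that keeps the sample with the largest magnitude in each block would handle both signs. This is a numpy sketch, not part of the original answer:

```python
import numpy as np

def downsample_abs(factor, values):
    # From each block of `factor` samples, keep the one with the
    # largest magnitude, so negative spikes survive too.
    n = len(values) // factor
    blocks = np.asarray(values[:n * factor]).reshape(n, factor)
    idx = np.abs(blocks).argmax(axis=1)
    return blocks[np.arange(n), idx]

sig = np.zeros(30)
sig[4] = 5.0
sig[17] = -6.0
print(downsample_abs(10, sig))  # [ 5. -6.  0.]
```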


Martin
  • *<>*. So you're saying the correct answer is simply "You can't"? :) – Marco Sulla Dec 13 '19 at 15:22
  • Imagine you have 3 points and the middle one has a spike. Now downsample it to 2 points. Where are you going to draw the spike? You can assign it to point 1 or to point 2 (which damages the signal), you can draw it at both points (which also doesn't make sense), or you don't draw it at all and simply accept that the Nyquist-Shannon sampling theorem is real. With data, you can't sit on two chairs at the same time. – Martin Dec 13 '19 at 15:42
  • @MarcoSulla I edited the answer with an example. It is just an example of how to keep the spikes. Instead of doing it as I did, you should extract the spikes as I pointed out in step 2). – Martin Dec 13 '19 at 18:21
  • 1
    Well, I think I'll accept your answer if you add at the start "You can't, the signal will be damaged. But if really want to ignore this..." or something similar :) – Marco Sulla Dec 16 '19 at 01:18
  • @MarcoSulla It's there – Martin Dec 16 '19 at 08:51

You can, if your hardware supports it, sample at the highest possible frequency but only save a point when a minimum difference in amplitude or in time is reached. That way your actual data points are filtered on either criterion. When nothing much changes in the signal you get your desired sample rate, and peaks are still registered.

Let's assume data contains your sampling points at a constant sampling rate. At the end of this algorithm, the list saved will contain all important [ timestamp, sample_point ] entries of your data:

DIVIDER = 5
THRESHOLD = 1000

saved = [ [0, data[0]] ]

for i in range(1, len(data)):
    if i % DIVIDER == 0 or abs(data[i] - data[i - 1]) > THRESHOLD:
        saved.append([ i, data[i] ])

Instead of looking at the amplitude difference between two sampling points, you could also save all data points that lie above or below a certain amplitude, with minor changes to this simple piece of code.
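That amplitude-threshold variant could look like the following sketch. The function name and the example values are made up; tune the divider and threshold to your data:

```python
def keep_salient(data, divider, amp_threshold):
    # Keep every `divider`-th sample, plus any sample whose absolute
    # amplitude exceeds `amp_threshold` (i.e. the spikes).
    saved = [[0, data[0]]]
    for i in range(1, len(data)):
        if i % divider == 0 or abs(data[i]) > amp_threshold:
            saved.append([i, data[i]])
    return saved

pts = keep_salient([0.0, 0.1, 9.0, 0.0, 0.2, -7.5, 0.1, 0.0],
                   divider=2, amp_threshold=5.0)
print(pts)  # [[0, 0.0], [2, 9.0], [4, 0.2], [5, -7.5], [6, 0.1]]
```

As with the original snippet, the result is not at a fixed sampling rate, which is why each saved point carries its timestamp.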

Anteino
  • I can't change the sampling rate. – Marco Sulla Dec 09 '19 at 01:39
  • But then you can still apply that same technique to the acquired data. You can save every other sample to get the halved sampling rate. If you still check every intermediate value for a certain threshold you can save only those intermediate points that matter. You can do the same if you divide the sampling rate by 4, 5 or 10. – Anteino Dec 09 '19 at 01:52
  • I'll write a short piece of code to demonstrate later. – Anteino Dec 09 '19 at 01:55
  • What about aliasing? – Marco Sulla Dec 09 '19 at 04:33
  • What about it? Do you want to downsample or not? – Anteino Dec 09 '19 at 12:53
  • Downsampling is the process of reducing the sampling rate of a signal, which you say you're trying to do. Of course you will experience aliasing, but if you know the lowest frequency in your signal, you can set your sampling frequency to twice that frequency (nyquist frequency). So, if in your 200 Hz signal the lowest frequency is 10 Hz you need a sampling frequency of at least 20 Hz and you can set the **DIVIDER** variable to 10. – Anteino Dec 09 '19 at 20:27
  • Can you share the data? I can probably demonstrate better when I know what kind of signal we're looking at. – Anteino Dec 09 '19 at 20:28
  • I have to downsample by a factor of 2; I can't downsample at whatever rate I want. And I can't share the data, it isn't mine. As I said, think of an earthquake: I have 6 seconds, with 1 second of low and high spikes. – Marco Sulla Dec 09 '19 at 20:35
  • 1
    Then you can set **DIVIDER** to 2 and run the algorithm I gave on the data you have. The **saved** list will contain the data you need, but not at a fixed sampling rate. That's why I am also saving the timestamps. You cannot have a constant sampling rate throughout your data **AND** preserve all your spikes. You have to make concessions. Having a locally higher sampling rate around the spikes seemed the best compromise to me. – Anteino Dec 09 '19 at 20:42

If you are not too fussy about aliasing, you can just take the maximum of each block of N samples (every 2nd sample in your case).

def take(N, samples):
    it = iter(samples)
    for _ in range(len(samples) // N):
        yield max(next(it) for _ in range(N))

Testing this:

import random
random.seed(1)

a = [random.gauss(10, 3) for _ in range(100)]

for c in take(5, a):
    print(c)
Mike Robins

As @Martin pointed out, the spikes contain high-frequency components. You could transform the signal into a rotating frame of reference (i.e. add/remove a base frequency). This effectively shifts the signal down by half the maximum required frequency, so you can sample at half the rate.
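A numpy sketch of that frequency-shifting ("mixing") idea; the base frequency f0 and the 80 Hz test tone are made-up values, not from the question's data:

```python
import numpy as np

fs = 200.0                          # sampling rate from the question (Hz)
t = np.arange(400) / fs             # 2 seconds of samples
f0 = 60.0                           # assumed base frequency to remove
x = np.cos(2 * np.pi * 80.0 * t)    # an 80 Hz test component

# Multiplying by a complex exponential rotates the frame: the whole
# spectrum shifts down by f0, so the 80 Hz component lands at 20 Hz.
z = x * np.exp(-2j * np.pi * f0 * t)
```

Note that z is complex, and a real cosine contains a negative-frequency half that the shift moves as well, so this only buys headroom if the signal's band is known and one-sided.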

Mohammed Li