Classifying pattern in time series

Question

I am dealing with a repeating pattern in time series data. My goal is to classify every pattern as 1, and anything that does not follow the pattern as 0. The pattern repeats itself between every two peaks as shown below in the image.

The patterns are not necessarily fixed in sample size but stay within approximate sample size, let's say 500samples +-10%. The heights of the peaks can change. The random signal (I called it random, but basically it means not following pattern shape) can also change in value.

The data is from a sensor. Patterns are when the device is working smoothly. If the device is malfunctioning, then I will not see the patterns and will get something similar to the class 0 I have shown in the image.

What I have done so far is building a logistic regression model. Here are my steps for data preparation:

Grab data between every two consecutive peaks, resample it to a fixed size of 100 samples, scale data to [0-1]. This is class 1.
Repeated step 1 on data between valley and called it class 0.
I generated some noise, and repeated step 1 on chunk of 500 samples to build extra class 0 data.

Bottom figure shows my predictions on the test dataset. Prediction on the noise chunk is not great. I am worried in the real data I may get even more false positives. Any idea on how I can improve my predictions? Any better approach when there is no class 0 data available?

I have seen similar question here. My understanding of Hidden Markov Model is limited but I believe it's used to predict future data. My goal is to classify a sliding window of 500 sample throughout my data.

jottbe · Answer 1 · 2020-10-18T12:17:39.057

I have some proposals, that you could try out. First, I think in this field often recurrent neural networks are used (e.g. LSTMs). But I also heard that some people also work with tree based method like light gbm (I think Aileen Nielsen uses this approach).

So if you don't want to dive into neural networks, which is probably not necessary, because your signals seem to be distinguishable relative easily, you can give light gbm (or other tree ensamble methods) a chance.

If you know the maximum length of a positive sample, you can define the length of your "sliding sample-window" that becomes your input vector (so each sample in the sliding window becomes one input feature), then I would add an extra attribute with the number of samples when the last peak occured (outside/before the sample window). Then you can check in how many steps you let your window slide over the data. This also depends on the memory you have available for this. But maybe it would be wise then to skip some of the windows between a change between positive and negative, because the states might not be classifiable unambiguously.

In case memory becomes an issue, neural networks could be the better choice, because for training they do not need all training data available at once, so you can generate your input data in batches. With tree based methods this possible does not exist or only in a very limited way.

score 0 · Answer 2 · answered Oct 19 '20 at 09:22

I'm not sure of what you are trying to achieve.

If you want to characterize what is a peak or not - which is an after the facts classification - then you can use a simple rule to define peaks such as signal(t) - average(signal, t-N to t) > T, with T a certain threshold and N a number of data points to look backwards to.

This would qualify what is a peak (class 1) and what is not (class 0), hence does a classification of patterns.

If your goal is to predict that a peak is going to happen few time units before the peak (on time t), using say data from t-n1 to t-n2 as features, then logistic regression might not necessarily be the best choice.

To find the right model you have to start with visualizing the features you have from t-n1 to t-n2 for every peak(t) and see if there is any pattern you can find. And it can be anything:

was there a peak in in the n3 days before t ?
is there a trend ?
was there an outlier (transform your data into exponential)

in order to compare these patterns, think of normalizing them so that the n2-n1 data points go from 0 to 1 for example.

If you find a pattern visually then you will know what kind of model is likely to work, on which features.

If you don't then it's likely that the white noise you added will be as good. so you might not find a good prediction model.

However, your bottom graph is not so bad; you have only 2 major false positives out of >15 predictions. This hints at better feature engineering.

Classifying pattern in time series

2 Answers2