1

I am confused about the tsfresh input format. Can I pass a DataFrame in which different ids have values at different time stamps? For example, time series 1 is {t0: 1, t2: 4, t5: 1} and time series 2 is {t1: 5, t2: 2}. Should I fill the missing values (t1, t3, etc.) with 0? Thanks in advance.

flyingdutchman
ZPB
  • Have you tried different strategies experimentally to see if there is a measurable difference for some of the features? – jtlz2 Aug 04 '22 at 06:53

1 Answer

2

tsfresh does not "care" about the time stamps of your data. Most of its feature calculators do not need fixed time intervals (e.g. the mean of a time series is the same no matter which time stamps the values belong to). So yes, technically it is possible to have different time stamps for different ids.
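As a rough illustration of the point above, here is a minimal sketch that puts the two series from the question into a long-format DataFrame with no rows for the missing time steps. The column names `id`, `time` and `value` are my own choice, not something tsfresh prescribes, and the tsfresh call is guarded so the pandas part still runs if tsfresh is not installed.

```python
import pandas as pd

# Long ("stacked") format: one row per observation.
# Missing time steps are simply absent; no placeholder rows are needed.
df = pd.DataFrame(
    {
        "id":    [1, 1, 1, 2, 2],   # which time series the row belongs to
        "time":  [0, 2, 5, 1, 2],   # t0, t2, t5 for id 1; t1, t2 for id 2
        "value": [1, 4, 1, 5, 2],
    }
)

# A feature such as the mean depends only on the values, not on the gaps.
means = df.sort_values(["id", "time"]).groupby("id")["value"].mean()
print(means)

try:
    from tsfresh import extract_features
    from tsfresh.feature_extraction import MinimalFCParameters

    # MinimalFCParameters keeps the output small (mean, length, ...).
    features = extract_features(
        df,
        column_id="id",
        column_sort="time",
        column_value="value",
        default_fc_parameters=MinimalFCParameters(),
        disable_progressbar=True,
        n_jobs=0,  # run in the current process, no multiprocessing
    )
    print(features)
except ImportError:
    # tsfresh is not installed; the pandas example above is unaffected.
    pass
```

The two ids have different lengths and different time stamps, and the frame is passed as it is.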

That being said, some feature calculators do rely on the time stamps and on regular time intervals (e.g. those based on the Fourier transform). There are many different ways to fill in the missing values, and choosing among them requires domain knowledge. That is why tsfresh does not do this "automatically". Libraries such as pandas offer several options for this, e.g. resampling methods.
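If you do decide to put the series on a regular grid first, a sketch with pandas could look like the following. Both fill strategies shown (zeros and linear interpolation) are only illustrations, not a recommendation; which one is appropriate depends on your data. The example uses integer time steps and `reindex`; with real datetime stamps, `resample` would play the same role.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id":    [1, 1, 1, 2, 2],
        "time":  [0, 2, 5, 1, 2],
        "value": [1.0, 4.0, 1.0, 5.0, 2.0],
    }
)

# Common, regular time grid covering all observed time steps (0..5 here).
grid = pd.RangeIndex(int(df["time"].min()), int(df["time"].max()) + 1, name="time")

# One column per id, one row per time step; gaps show up as NaN.
wide = (
    df.pivot(index="time", columns="id", values="value")
    .reindex(grid)
    .rename_axis("time")
)

# Strategy A: fill every gap with 0 (changes features such as the mean).
filled_zero = wide.fillna(0.0)

# Strategy B: linear interpolation, only between known values of each id.
# Leading/trailing gaps stay NaN and are dropped again below.
interp = wide.interpolate(method="linear", limit_area="inside")

# Back to long format, which is what would be handed to tsfresh.
long = (
    interp.reset_index()
    .melt(id_vars="time", var_name="id", value_name="value")
    .dropna(subset=["value"])
)
print(long)
```

Note that filling with 0 makes the mean of series 1 drop from 2.0 to 1.0, which is one reason the choice should not be made blindly.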

nilpferd1991