1

I have a dataframe containing route coordinates and timestamps recorded at random intervals (from 1 to 50 seconds), as shown in this dataframe sample. I am looking for a way to trim the dataset so that only rows at least 30 seconds apart are kept.

For example if the time stamp by index is like the following:

  • [0] 2017-03-27 06:52:30
  • [1] 2017-03-27 06:52:32
  • [2] 2017-03-27 06:52:45
  • [3] 2017-03-27 06:52:59
  • [4] 2017-03-27 06:53:02
  • [5] 2017-03-27 06:53:32
  • [...] ......

Ideally I would like to keep only:

  • [0] 2017-03-27 06:52:30
  • [4] 2017-03-27 06:53:02
  • [5] 2017-03-27 06:53:32
  • [...] ......

Even a hint would be helpful!

Thank you!

oikonang

2 Answers

2

Consider the dataframe df

import pandas as pd
from pandas import Timestamp

df = pd.DataFrame({
        'date': [Timestamp('2017-03-27 06:52:30'),
                 Timestamp('2017-03-27 06:52:32'),
                 Timestamp('2017-03-27 06:52:45'),
                 Timestamp('2017-03-27 06:52:59'),
                 Timestamp('2017-03-27 06:53:02'),
                 Timestamp('2017-03-27 06:53:32')]
    })

I use a generator that walks through the timestamps and yields the index each time the elapsed time since the last kept row reaches the threshold; that row then becomes the new reference point.

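def f(s, thresh):
    # cur holds the timestamp of the last row that was kept
    cur = None
    # Series.items() replaces iteritems(), which was removed in pandas 2.0
    for i, v in s.items():
        if (cur is None) or (v - cur >= thresh):
            yield i
            cur = v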


df.loc[list(f(df.date, pd.to_timedelta(30, 's')))]

                 date
0 2017-03-27 06:52:30
4 2017-03-27 06:53:02
5 2017-03-27 06:53:32
piRSquared
0

As you have not provided the data frame, let's say your column name is time. You could do: df.time - df.time.shift(1) (equivalently, df.time.diff()). This will give you a column of the differences between each row and the one before it. You can then use that new column to filter the time column.
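To make the shift idea concrete, here is a minimal sketch using the timestamps from the question. The column name time is the assumption made above, and the names gap, mask and consecutive are only illustrative. Note that a filter on consecutive differences is not quite the same as measuring from the last kept row, so treat this as a starting point rather than a full solution.

```python
import pandas as pd

df = pd.DataFrame({
    "time": pd.to_datetime([
        "2017-03-27 06:52:30",
        "2017-03-27 06:52:32",
        "2017-03-27 06:52:45",
        "2017-03-27 06:52:59",
        "2017-03-27 06:53:02",
        "2017-03-27 06:53:32",
    ])
})

# Gap between each row and the row before it; the first row has no
# predecessor, so its gap is NaT.
gap = df["time"].diff()

# Keep the first row plus every row at least 30 seconds after its predecessor.
mask = gap.isna() | (gap >= pd.Timedelta(seconds=30))
consecutive = df[mask]
```

On this sample the filter keeps indices 0 and 5 but drops index 4, because row 4 is only 3 seconds after row 3. The output asked for in the question measures each gap from the last row that was kept, which needs a running reference rather than a single shifted column.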

This post here is not a duplicate, but may be used for the application of this shift method.

This is a big hint to how I would approach it. Hope it helps!

P.S. Please provide the full data frame in future questions, so that the code can be clearly seen and referenced in replies.

Community
Newskooler