
I'm trying to resample the Timestamp column of this DataFrame:

  Transit.head():

      Timestamp                            Plate           Gate
  0 2013-11-01 21:02:17 4f5716dcd615f21f658229a8570483a8    65
  1 2013-11-01 16:12:39 0abba297ac142f63c604b3989d0ce980    64
  2 2013-11-01 11:06:10 faafae756ce1df66f34f80479d69411d    57

Here's what I've done:

  Transit.drop_duplicates(inplace=True)
  Transit.Timestamp = pd.to_datetime(Transit.Timestamp)
  Transit['Timestamp'].resample('1H').pad()

But I got this error:

  Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

Any suggestions would be much appreciated.

Dimi

1 Answer


resample needs a DatetimeIndex, so create one with DataFrame.set_index. This works for both upsampling and downsampling:

# ffill() is the current name for pad(); both forward-fill the new hourly rows
df = Transit.set_index('Timestamp').resample('1H').ffill()
print(df)
                                                Plate  Gate
Timestamp                                                  
2013-11-01 11:00:00                               NaN   NaN
2013-11-01 12:00:00  faafae756ce1df66f34f80479d69411d  57.0
2013-11-01 13:00:00  faafae756ce1df66f34f80479d69411d  57.0
2013-11-01 14:00:00  faafae756ce1df66f34f80479d69411d  57.0
2013-11-01 15:00:00  faafae756ce1df66f34f80479d69411d  57.0
2013-11-01 16:00:00  faafae756ce1df66f34f80479d69411d  57.0
2013-11-01 17:00:00  0abba297ac142f63c604b3989d0ce980  64.0
2013-11-01 18:00:00  0abba297ac142f63c604b3989d0ce980  64.0
2013-11-01 19:00:00  0abba297ac142f63c604b3989d0ce980  64.0
2013-11-01 20:00:00  0abba297ac142f63c604b3989d0ce980  64.0
2013-11-01 21:00:00  0abba297ac142f63c604b3989d0ce980  64.0
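
To make the difference concrete, here is a minimal sketch that rebuilds the three sample rows from the question and contrasts the failing call with the indexed one. The lowercase name `transit` is only illustrative, the explicit `sort_index()` is my own precaution rather than part of the answer, and the lowercase `h` alias is used because recent pandas releases prefer it over `H`.

```python
import pandas as pd

# Sample rows copied from the question; the name `transit` is illustrative.
transit = pd.DataFrame({
    "Timestamp": ["2013-11-01 21:02:17", "2013-11-01 16:12:39", "2013-11-01 11:06:10"],
    "Plate": [
        "4f5716dcd615f21f658229a8570483a8",
        "0abba297ac142f63c604b3989d0ce980",
        "faafae756ce1df66f34f80479d69411d",
    ],
    "Gate": [65, 64, 57],
})
transit["Timestamp"] = pd.to_datetime(transit["Timestamp"])

# The column still sits on the default integer index, so resample rejects it.
try:
    transit["Timestamp"].resample("h").ffill()
    raised = False
except TypeError:
    raised = True

# Move Timestamp into the index, sort chronologically (a precaution added
# here), then upsample to hourly rows and forward-fill the gaps.
hourly = transit.set_index("Timestamp").sort_index().resample("h").ffill()
print(hourly)
```

The first hourly row (11:00) stays NaN because no observation exists at or before that time to fill from.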

For downsampling, you can pass the on parameter instead of setting the index:

# Select the numeric column first; averaging the Plate strings is not meaningful
df = Transit.resample('D', on='Timestamp')[['Gate']].mean()
print(df)
            Gate
Timestamp       
2013-11-01    62
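
As a small self-contained sketch of the `on` variant, again assuming the three sample rows from the question (the names `transit` and `daily` are illustrative):

```python
import pandas as pd

# Gate and Timestamp values copied from the question; names are illustrative.
transit = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2013-11-01 21:02:17", "2013-11-01 16:12:39", "2013-11-01 11:06:10"]
    ),
    "Gate": [65, 64, 57],
})

# `on` names the datetime column, so the integer index can stay in place.
daily = transit.resample("D", on="Timestamp")["Gate"].mean()
print(daily)
```

All three rows fall on the same day, so the result has a single entry, the mean of 65, 64 and 57.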

EDIT: To remove all rows with a duplicated Timestamp, pass the subset parameter to DataFrame.drop_duplicates:

Transit.drop_duplicates(subset=['Timestamp'], inplace=True)
Transit.Timestamp = pd.to_datetime(Transit.Timestamp)
df = Transit.set_index('Timestamp').resample('1H').ffill()  # ffill() is the current name for pad()
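
To see why `subset` matters, here is a sketch with made-up rows (the plates `aaa`, `bbb`, `ccc` and the gate numbers are hypothetical, not from the question) in which two different plates share one timestamp:

```python
import pandas as pd

# Hypothetical data: two different plates recorded in the same second.
transit = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2013-11-01 11:06:10", "2013-11-01 11:06:10", "2013-11-01 16:12:39"]
    ),
    "Plate": ["aaa", "bbb", "ccc"],
    "Gate": [57, 58, 64],
})

# Comparing whole rows finds no duplicates, so the timestamps stay non-unique.
all_columns = transit.drop_duplicates()
# Comparing only Timestamp keeps the first row seen for each timestamp.
by_timestamp = transit.drop_duplicates(subset=["Timestamp"])

print(all_columns.set_index("Timestamp").index.is_unique)   # False
print(by_timestamp.set_index("Timestamp").index.is_unique)  # True

# With a unique index, upsampling with forward-fill can proceed.
hourly = by_timestamp.set_index("Timestamp").resample("h").ffill()
```

Note that this discards the second plate seen at 11:06:10; whether that is acceptable depends on what the hourly table is meant to represent.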
jezrael
  • I've already tried that, and this is what I got: `cannot reindex a non-unique index with a method or limit` – Dimi Apr 02 '19 at 11:54
  • @Dimi - Edited answer. – jezrael Apr 02 '19 at 12:02
  • Indeed, the way I was trying to drop duplicates was wrong; that resolved my issue. – Dimi Apr 02 '19 at 12:02
  • @Dimi - Without the `subset` parameter, duplicates are detected by comparing all columns; to check only one column, you need to specify it. – jezrael Apr 02 '19 at 12:03
  • When I run `Transit.set_index('Timestamp').resample('D').sum()` in my case, the output looks like `2013-11-01T11:00:00.000000000` and there is no longer a `Timestamp` column! – PM0087 Dec 06 '19 at 15:15
  • @PeyM87 - then use `Transit.set_index('Timestamp').resample('D').sum().rename_axis('Timestamp')` – jezrael Dec 06 '19 at 15:20
  • @jezrael: Thanks for the super quick reply! It didn't work though. But I found simply adding this line after the first line works: `Transit.reset_index(inplace=True)` – PM0087 Dec 06 '19 at 15:25
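
For readers who hit the same thing as in the last comments: after `set_index` plus `resample`, Timestamp lives in the index rather than in a column. A minimal sketch of getting it back as a column, assuming the sample Gate values from the question (the names `transit`, `daily` and `daily_flat` are illustrative):

```python
import pandas as pd

# Timestamp and Gate values copied from the question; names are illustrative.
transit = pd.DataFrame({
    "Timestamp": pd.to_datetime(
        ["2013-11-01 21:02:17", "2013-11-01 16:12:39", "2013-11-01 11:06:10"]
    ),
    "Gate": [65, 64, 57],
})

# After resampling, Timestamp is the index of the result, not a column.
daily = transit.set_index("Timestamp").resample("D")[["Gate"]].sum()
print(list(daily.columns))       # ['Gate']

# reset_index() moves the index back into an ordinary column.
daily_flat = daily.reset_index()
print(list(daily_flat.columns))  # ['Timestamp', 'Gate']
```

Here `reset_index()` is applied to the resampled result instead of using `inplace=True` on the original frame, which keeps the source data untouched.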