
I have a DataFrame with a MultiIndex of 'Date' and 'Time'. I would like to delete the last two rows of each day.

For example:

Date           Time               colA                 colB
01/01/2018    08:00               15                   'abc'
01/01/2018    09:00               16                   'abd'
01/01/2018    11:00               17                   'abe'
01/01/2018    14:00               18                   'abf'
03/01/2018    11:30               19                   'abg'
03/01/2018    18:00               20                   'abh'
03/01/2018    19:00               21                   'abi'
03/01/2018    19:20               22                   'abj'
04/01/2018    14:00               23                   'abk'
04/01/2018    16:00               24                   'abl'
04/01/2018    17:00               25                   'abm'
04/01/2018    18:00               26                   'abn'
04/01/2018    19:00               27                   'abo'

would become:

Date           Time               colA                 colB
01/01/2018    08:00               15                   'abc'
01/01/2018    09:00               16                   'abd'
03/01/2018    11:30               19                   'abg'
03/01/2018    18:00               20                   'abh'
04/01/2018    14:00               23                   'abk'
04/01/2018    16:00               24                   'abl'
04/01/2018    17:00               25                   'abm'

How can I achieve this?

ALollz
Sithered

2 Answers


Assuming the DataFrame has a MultiIndex with Date and Time as its levels, group by the first level (Date) and slice off the last two rows of each group:

df.groupby(level=0, as_index=False).apply(lambda x: x.iloc[:-2])


                        colA colB
    Date        Time        
0   01/01/2018  08:00   15  'abc'
                09:00   16  'abd'
1   03/01/2018  11:30   19  'abg'
                18:00   20  'abh'
2   04/01/2018  14:00   23  'abk'
                16:00   24  'abl'
                17:00   25  'abm'
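
For anyone who wants to try this end to end, here is a minimal runnable sketch of the apply approach. It rebuilds the sample data from the question, assuming colB holds plain strings; the names dates, times and trimmed are illustrative and not from the original post. It also swaps as_index=False for group_keys=False, which should keep the original two-level index instead of adding the outer 0/1/2 level seen in the output above.

```python
import pandas as pd

# Sample data copied from the question (colB assumed to be plain strings).
dates = ["01/01/2018"] * 4 + ["03/01/2018"] * 4 + ["04/01/2018"] * 5
times = ["08:00", "09:00", "11:00", "14:00",
         "11:30", "18:00", "19:00", "19:20",
         "14:00", "16:00", "17:00", "18:00", "19:00"]
df = pd.DataFrame(
    {"colA": list(range(15, 28)), "colB": ["ab" + c for c in "cdefghijklmno"]},
    index=pd.MultiIndex.from_arrays([dates, times], names=["Date", "Time"]),
)

# Group by the Date level and keep everything except the last two rows of each day.
# group_keys=False avoids prepending a group level to the result's index.
trimmed = df.groupby(level=0, group_keys=False).apply(lambda g: g.iloc[:-2])
print(trimmed)
```

Note that a day with two rows or fewer would disappear entirely, since slicing with iloc[:-2] leaves nothing for it.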
Vaishali

Using cumcount to avoid apply:

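# Number each row from the end of its Date group (last row = 0), then keep positions 2 and up
s = df.groupby(level=0).cumcount(ascending=False)
df[s > 1]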

                  colA   colB
Date       Time
01/01/2018 08:00    15  'abc'
           09:00    16  'abd'
03/01/2018 11:30    19  'abg'
           18:00    20  'abh'
04/01/2018 14:00    23  'abk'
           16:00    24  'abl'
           17:00    25  'abm'
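
A self-contained sketch of the cumcount idea, for readers who want to run it. It rebuilds the question's sample data, assuming colB holds plain strings; the names dates, times, from_end and trimmed are illustrative only. The count runs from the end of each group (cumcount with ascending=False), so the last row of each day gets 0 and the one before it gets 1.

```python
import pandas as pd

# Sample data copied from the question (colB assumed to be plain strings).
dates = ["01/01/2018"] * 4 + ["03/01/2018"] * 4 + ["04/01/2018"] * 5
times = ["08:00", "09:00", "11:00", "14:00",
         "11:30", "18:00", "19:00", "19:20",
         "14:00", "16:00", "17:00", "18:00", "19:00"]
df = pd.DataFrame(
    {"colA": list(range(15, 28)), "colB": ["ab" + c for c in "cdefghijklmno"]},
    index=pd.MultiIndex.from_arrays([dates, times], names=["Date", "Time"]),
)

# Position of each row counted backwards within its Date group.
from_end = df.groupby(level=0).cumcount(ascending=False)

# Rows at position 0 or 1 are the last two of their day; keep the rest.
trimmed = df[from_end > 1]
print(trimmed)
```

Because this is a plain boolean mask, the original index and row order are left untouched, and no Python-level function is called per group.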
user3483203