
I've got a dataframe with power profiles. Each row holds the start time, end time, and consumed power of a transaction. It looks something like this:

TransactionId  StartTime            EndTime              Power
xyza123        2018.01.01 07:07:34  2018.01.01 07:34:08  70
hjker383       2018.01.01 10:21:00  2018.01.01 11:40:08  23

My goal is to assign a new StartTime and EndTime snapped to 15-minute marks, like so:

TransactionId  StartTime            New Starttime        EndTime              New EndTime          Power
xyza123        2018.01.01 07:07:34  2018.01.01 07:00:00  2018.01.01 07:34:08  2018.01.01 07:30:00  70
hjker383       2018.01.01 10:21:00  2018.01.01 10:30:00  2018.01.01 11:40:08  2018.01.01 11:45:00  23

The old timestamps can be deleted afterwards. However, I don't want to aggregate the rows, so I guess

df.groupby(pd.Grouper(key="StartTime", freq="15min")).sum()

or

df.groupby(pd.Grouper(key="EndTime", freq="15min")).mean()

etc. is not an option. Another idea I had was creating a dataframe with values between 2018.01.01 00:00:00 and 2018.01.01 23:45:00. However, I am not sure how to iterate through the two dataframes to achieve my goal, or whether iterating through dataframes is a good idea in the first place.
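For what it's worth, the 15-minute grid described above can be generated in one call with `pd.date_range` (a minimal sketch; no merge against the transaction frame is shown here):

```python
import pandas as pd

# every 15-minute mark on 2018-01-01: 96 values from 00:00:00 to 23:45:00
grid = pd.date_range("2018-01-01 00:00:00", "2018-01-01 23:45:00", freq="15min")
```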

    In your code -- 7:07:34 is being converted to 7:00:00 (going down) but 11:40:08 is being converted to 11:45:00 (going up), even though both are greater than midpoint. Is there a logic or is this a mistake? – Shubham Periwal Apr 28 '21 at 08:12
  • It's a small mistake, since I ignored the seconds. However, they should be ignored in the later code as well, so I might need to convert to `%H:%M` first – J. Susan Apr 28 '21 at 08:46

3 Answers


You can use a function to convert a datetime to the nearest 15 minutes and then apply it to the column. This function was inspired by this link:

import datetime

import pandas as pd

def convertToNearest15(tm):
    # amount by which tm overshoots the previous 15-minute mark
    discard = datetime.timedelta(minutes=tm.minute % 15,
                                 seconds=tm.second,
                                 microseconds=tm.microsecond)
    tm -= discard  # floor to the previous 15-minute mark
    # round up if the overshoot was 7.5 minutes or more
    if discard >= datetime.timedelta(minutes=7.5):
        tm += datetime.timedelta(minutes=15)
    return tm


df['startTime'] = pd.to_datetime(df['startTime'])
df['newStartTime'] = df['startTime'].apply(convertToNearest15)
df['endTime'] = pd.to_datetime(df['endTime'])
df['newEndTime'] = df['endTime'].apply(convertToNearest15)

Here's the result:

id       | startTime           | endTime             | newStartTime        | newEndTime
xyza123  | 2018-01-01 07:07:34 | 2018-01-01 10:21:00 | 2018-01-01 07:15:00 | 2018-01-01 10:15:00
hjker383 | 2018-01-01 07:34:08 | 2018-01-01 11:40:08 | 2018-01-01 07:30:00 | 2018-01-01 11:45:00
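As a side note, pandas can do the same nearest-15-minute rounding vectorized with `Series.dt.round`, avoiding the Python-level `apply` (a sketch on a tiny throwaway frame with the sample timestamps):

```python
import pandas as pd

df = pd.DataFrame({
    "startTime": ["2018-01-01 07:07:34"],
    "endTime": ["2018-01-01 11:40:08"],
})
df["startTime"] = pd.to_datetime(df["startTime"])
df["endTime"] = pd.to_datetime(df["endTime"])

# round each timestamp to the nearest 15-minute mark
df["newStartTime"] = df["startTime"].dt.round("15min")
df["newEndTime"] = df["endTime"].dt.round("15min")
```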

Resample the StartTime and EndTime to 15-minute frequency:

import pandas as pd

df['StartTime'] = pd.to_datetime(df.StartTime)
df['EndTime'] = pd.to_datetime(df.EndTime)
# resample floors each timestamp to the start of its 15-minute bin;
# first()/dropna() keep one row per occupied bin
df = df.resample('15min', on='StartTime').first().dropna().rename_axis('New Starttime').reset_index()
df = df.resample('15min', on='EndTime').first().dropna().rename_axis('New EndTime').reset_index()

Output

Rearrange the df columns as required.

    New EndTime          New Starttime        TransactionId  StartTime            EndTime              Power
0   2018-01-01 07:30:00  2018-01-01 07:00:00  xyza123        2018-01-01 07:07:34  2018-01-01 07:34:08  70.0
1   2018-01-01 11:30:00  2018-01-01 10:15:00  hjker383       2018-01-01 10:21:00  2018-01-01 11:40:08  23.0
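Note that `resample` always floors to the bin start (11:40:08 becomes 11:30:00, not 11:45:00), and `first()` keeps only one row per bin, so transactions sharing a bin would be dropped. If the flooring behavior is what you want but every row must survive, a per-row `Series.dt.floor` avoids the regrouping (a sketch on the sample data):

```python
import pandas as pd

df = pd.DataFrame({
    "TransactionId": ["xyza123", "hjker383"],
    "StartTime": ["2018-01-01 07:07:34", "2018-01-01 10:21:00"],
    "EndTime": ["2018-01-01 07:34:08", "2018-01-01 11:40:08"],
    "Power": [70, 23],
})
df["StartTime"] = pd.to_datetime(df["StartTime"])
df["EndTime"] = pd.to_datetime(df["EndTime"])

# floor each timestamp to its 15-minute bin without regrouping rows
df["New Starttime"] = df["StartTime"].dt.floor("15min")
df["New EndTime"] = df["EndTime"].dt.floor("15min")
```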
This floors each timestamp to the previous 15-minute mark (clearing microseconds as well):

df['new_date'] = df['date'].apply(lambda x: x.replace(minute=(x.minute // 15 * 15), second=0, microsecond=0))
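A quick check of this floor-style one-liner on the sample times, assuming a `date` column:

```python
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["2018-01-01 07:07:34", "2018-01-01 11:40:08"])})
# floors: 07:07:34 -> 07:00:00, 11:40:08 -> 11:30:00 (never rounds up)
df["new_date"] = df["date"].apply(
    lambda x: x.replace(minute=(x.minute // 15 * 15), second=0, microsecond=0)
)
```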