0

This is a dataset that captures the time and duration that a switch is turned on/off. The Input column is the input captured at that time, while the duration column is the duration the switch is in the on/off status.

          Time           Device  Input Status  Duration
0   16-11-2020 00:00:00   led1    off     on      0.00
1   16-11-2020 15:24:00   led1     on    off    924.20
2   16-11-2020 17:51:00   led1    off     on    147.55
3   16-11-2020 19:25:00   led1     on    off     93.70
4   17-11-2020 01:07:00   led1    off     on    341.97
5   17-11-2020 18:45:00   led1     on    off   1057.87
6   17-11-2020 20:02:00   led1    off     on     77.60
7   17-11-2020 20:34:00   led1     on    off     31.65

I can split the date and time into new columns (using the only way i know how)

for i in range(len(df)):
    df['Date'] = [i for i in range(len(df))]
    df['Hour'] = [i for i in range(len(df))]
    
for i in range(len(df)):
    df['Date'][i] = df['Time'][i][:10]
    df['Hour'][i] = df['Time'][i][11:]

and the dataframe will return this

            Time       Device   InputStatusDuration Date    Hour
0   16-11-2020 00:00:00 led1    off on  0.00    16-11-2020  00:00:00
1   16-11-2020 15:24:00 led1    on  off 924.20  16-11-2020  15:24:00
2   16-11-2020 17:51:00 led1    off on  147.55  16-11-2020  17:51:00
3   16-11-2020 19:25:00 led1    on  off 93.70   16-11-2020  19:25:00
4   17-11-2020 01:07:00 led1    off on  341.97  17-11-2020  01:07:00
5   17-11-2020 18:45:00 led1    on  off 1057.87 17-11-2020  18:45:00
6   17-11-2020 20:02:00 led1    off on  77.60   17-11-2020  20:02:00
7   17-11-2020 20:34:00 led1    on  off 31.65   17-11-2020  20:34:00

Now how do I proceed to append a new column called 'Total duration', which is the sum of the duration for rows with the same dates?

  • 1
    Does this answer your question? [Pandas: sum DataFrame rows for given columns](https://stackoverflow.com/questions/25748683/pandas-sum-dataframe-rows-for-given-columns) – KaRaOkEy Nov 27 '20 at 07:46
  • Alternatively, have a look here: https://stackoverflow.com/questions/30244952/how-do-i-create-a-new-column-from-the-output-of-pandas-groupby-sum – KaRaOkEy Nov 27 '20 at 07:49

1 Answers1

3

An easier way to create the Date and Hour columns would be something like this:

df['Time'] = pd.to_datetime(df['Time'])
df['Date'] = df['Time'].apply(lambda x: x.date())
df['Hour'] = df['Time'].apply(lambda x: x.time())

In order to get the sum of durations by date, you could then afterwards do something like

print(df.groupby('Date').sum())
oskros
  • 3,101
  • 2
  • 9
  • 28