I'm new to Python and my English is not very good, so I'll try to explain my problem with the example below.

In : ds  # my DataFrame
Out:
              DateStarted       DateCompleted  DayStarted DayCompleted  \
1460  2017-06-12 14:03:32 2017-06-12 14:04:07  2017-06-12   2017-06-12   
14445 2017-06-13 13:39:16 2017-06-13 13:40:32  2017-06-13   2017-06-13   
14109 2017-06-21 10:25:36 2017-06-21 10:32:17  2017-06-21   2017-06-21   
16652 2017-06-27 15:44:28 2017-06-27 15:44:41  2017-06-27   2017-06-27   
30062 2017-07-05 09:49:01 2017-07-05 10:04:00  2017-07-05   2017-07-05   
22357 2017-08-31 09:06:00 2017-08-31 09:10:31  2017-08-31   2017-08-31   
39117 2017-09-08 08:43:07 2017-09-08 08:44:51  2017-09-08   2017-09-08   
41903 2017-09-15 12:54:40 2017-09-15 14:00:06  2017-09-15   2017-09-15   
74633 2017-09-27 12:41:09 2017-09-27 13:16:04  2017-09-27   2017-09-27   
69315 2017-10-23 08:25:28 2017-10-23 08:26:09  2017-10-23   2017-10-23   
87508 2017-10-30 12:19:19 2017-10-30 12:19:45  2017-10-30   2017-10-30   
86828 2017-11-03 12:20:09 2017-11-03 12:24:56  2017-11-03   2017-11-03   
89877 2017-11-06 13:52:05 2017-11-06 13:52:50  2017-11-06   2017-11-06   
94970 2017-11-07 08:09:53 2017-11-07 08:10:15  2017-11-07   2017-11-07   
94866 2017-11-28 14:38:14 2017-11-30 07:51:04  2017-11-28   2017-11-30   

       DailyTotalActiveTime      diff  
1460                    NaN      35.0  
14445                   NaN      76.0  
14109                   NaN     401.0  
16652                   NaN      13.0  
30062                   NaN     899.0  
22357                   NaN     271.0  
39117                   NaN     104.0  
41903                   NaN    3926.0  
74633                   NaN    2095.0  
69315                   NaN      41.0  
87508                   NaN      26.0  
86828                   NaN     287.0  
89877                   NaN      45.0  
94970                   NaN      22.0  
94866                   NaN  148370.0  

In the DailyTotalActiveTime column, I want to calculate the total active time
for each specific day. The diff column is in seconds.
I tried this, but it gave no results:

for i in ds['diff']:
    if i <= 86400:
        ds['DailyTotalActiveTime']==i
    else:
        ds['DailyTotalActiveTime']==86400
        ds['DailyTotalActiveTime']+1 == i-86400

What can I do? Again, sorry for the unclear explanation.

  • I'm a bit confused by your explanation. Can you post an example of what you want `DailyTotalActiveTime` to contain? – Stev Feb 12 '18 at 12:35
  • Take the last row, for example: diff = 148370 seconds, and each day has 86400 s. So the DailyTotalActiveTime of 2017-11-30 must be 86400, and the rest (148370 - 86400 = 61970) must be added to the next day (2017-11-31). I hope this helps. – Xenophon Psichis Feb 12 '18 at 12:43
  • Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve) and revise your question accordingly. These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – jezrael Feb 12 '18 at 12:44
  • OK, but 148370 seconds starting from '2017-11-28 14:38:14' should be: `2017-11-28: 33706.0s; 2017-11-29: 86400.0s; 2017-11-30: 28264.0s`. In this case, I think you need separate rows for each day in your DataFrame. Do you have that? It doesn't look like it in the DataFrame you have shown. It might be best to output to a Series or a new DataFrame. – Stev Feb 12 '18 at 13:05
  • @Stev Thanks for your answer. How can I create separate rows for each day in my DataFrame? – Xenophon Psichis Feb 12 '18 at 13:20

2 Answers

You should use `=` (assignment) instead of `==` (comparison).
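
A minimal sketch of the difference, using three `diff` values from the question. It only mirrors the logic of the loop in the question (cap each value at 86400 and keep the remainder); it does not spread time across calendar days. The `Overflow` column name is made up for illustration.

```python
import pandas as pd

# Three `diff` values (in seconds) copied from the question.
ds = pd.DataFrame({"diff": [35.0, 899.0, 148370.0]})
SECONDS_PER_DAY = 86400

# `==` only compares: it returns a boolean Series and leaves ds unchanged.
comparison = ds["diff"] == 35.0

# `=` assigns. Series.clip caps every value at one day's worth of seconds
# in a single vectorized step, so no row-by-row loop is needed.
ds["DailyTotalActiveTime"] = ds["diff"].clip(upper=SECONDS_PER_DAY)

# Seconds beyond the first day (0 when the record fits within one day).
# "Overflow" is a hypothetical column name, not part of the question.
ds["Overflow"] = (ds["diff"] - SECONDS_PER_DAY).clip(lower=0)

print(ds)
```

Note that assigning inside a `for` loop, as in the question, would overwrite the whole column on every iteration, which is why the sketch assigns the column once.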

Joe

To get you halfway there, you could do something like the following (I am sure there must be a simpler way, but I can't see it right now):

import pandas as pd

df['datestarted'] = pd.to_datetime(df['datestarted'])
df['datecompleted'] = pd.to_datetime(df['datecompleted'])

df['daystarted'] = df['datestarted'].dt.date
df['daycompleted'] = df['datecompleted'].dt.date

df['Date'] = df['daystarted'] # This is the unique date per row.

# Add one extra row for every additional calendar day a record spans.
for row in df.itertuples():
    if (row.daycompleted - row.daystarted) > pd.Timedelta(days=0):
        for i in range(1, (row.daycompleted - row.daystarted).days+1):
            df2 = pd.DataFrame([row]).drop('Index', axis=1)
            df2['Date'] = df2['Date'] + pd.Timedelta(days=i)
            # DataFrame.append was removed in pandas 2.0; use pd.concat instead.
            df = pd.concat([df, df2])
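
To go the rest of the way, here is one possible sketch that computes the seconds per calendar day directly, following the per-day breakdown discussed in the comments. It assumes the desired result is a Series indexed by day, and the helper name `seconds_per_day` is made up for illustration. It walks each record from its start to its end, cutting at every midnight.

```python
import pandas as pd

def seconds_per_day(starts, ends):
    """Sum, per calendar day, the seconds covered by each [start, end] interval."""
    totals = {}
    for start, end in zip(pd.to_datetime(starts), pd.to_datetime(ends)):
        cursor = start
        while cursor < end:
            day = cursor.normalize()  # midnight at the start of cursor's day
            next_midnight = day + pd.Timedelta(days=1)
            chunk_end = min(end, next_midnight)  # stop at midnight or at the end
            totals[day] = totals.get(day, 0.0) + (chunk_end - cursor).total_seconds()
            cursor = chunk_end
    return pd.Series(totals, dtype="float64").sort_index()

# Two rows copied from the question's data.
ds = pd.DataFrame({
    "DateStarted": ["2017-11-07 08:09:53", "2017-11-28 14:38:14"],
    "DateCompleted": ["2017-11-07 08:10:15", "2017-11-30 07:51:04"],
})

daily = seconds_per_day(ds["DateStarted"], ds["DateCompleted"])
print(daily)
# 2017-11-07       22.0
# 2017-11-28    33706.0
# 2017-11-29    86400.0
# 2017-11-30    28264.0
```

For the last row this gives 33706 + 86400 + 28264 = 148370 seconds, which matches its `diff` value. A plain Python loop is used here for readability; it may be slow on very large frames.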
Stev