1

enter image description here

I am trying to read the column total_time and sum up the times where the user is the same meaning 'name 1' = 03:30 + 04:00 and so on, I am trying to use pandas and I have manage to get the data separated, but when I try to take the total_time, as this is an object, I can not just sum it up, so I have tried to change the format of this object to a datetime object, but I haven't been able to do it.

format_time="%H:%M"

user1 = df.loc[df['user']== "name 1"]

user1['hora']=datetime.strptime((user1['total_time']),format_time)

I have tried this and it gives me an error

AttributeError: module 'datetime' has no attribute 'strptime'
Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158

2 Answers2

2

The attribute error shows that you most likely used the following import:

import datetime

Try this instead:

from datetime import datetime
DapperDuck
  • 2,728
  • 1
  • 9
  • 21
  • 1
    This is not a good answer, because while it addresses the specified error, it does not address the fact that what the OP is doing is incorrect and will result in an error, even with the correct namespace for the function. – Trenton McKinney Jan 23 '21 at 00:41
1
  • Aside from the fact that the .strptime is being accessed from the incorrect namespace, it should be datetime.datetime.strptime, the implementation will result in an error (TypeError: strptime() argument 1 must be str, not Series), even with the correct function call.
    • pandas already has built-in vectorized datetime features.
  • In this case, the hours and minutes in the 'total_time column, represent an absolute amount of time, so the column should be converted with pandas.to_timedelta.
    • This function takes a string in the form 'hh:mm:ss', so df.tt + ':00' is used to get the 'total_time' ('tt') column into the correct format.
import pandas as pd

# test dataframe
df = pd.DataFrame({'user': ['n1', 'n2', 'n3', '', 'n1', 'n2'],
                   'ts': ['09:30', '09:30', '09:30', '', '14:00', '10:00'],
                   'te': ['14:00', '10:00', '12:00', '', '18:00', '18:00'],
                   'ls': ['12:00', '', '', '', '', '13:00'],
                   'le': ['13:00', '', '', '', '', '14:00'],
                   'tt': ['03:00', '00:30', '02:30', '00:00', '04:00', '07:00']})

# display(df)
  user     ts     te     ls     le     tt
0   n1  09:30  14:00  12:00  13:00  03:00
1   n2  09:30  10:00                00:30
2   n3  09:30  12:00                02:30
3                                   00:00
4   n1  14:00  18:00                04:00
5   n2  10:00  18:00  13:00  14:00  07:00
# fill blanks with nan
df = df.replace('', np.nan)

# drop all rows that are all nan, except total_time
df = df.dropna(axis=0, how='all', subset=df.columns[:-2])

# display(df)
  user     ts     te     ls     le     tt
0   n1  09:30  14:00  12:00  13:00  03:00
1   n2  09:30  10:00    NaN    NaN  00:30
2   n3  09:30  12:00    NaN    NaN  02:30
4   n1  14:00  18:00    NaN    NaN  04:00
5   n2  10:00  18:00  13:00  14:00  07:00
# convert total_time (tt) to a timedelta
df.tt = pd.to_timedelta(df.tt + ':00')

# groupby user and sum the tt column
total_hours = df.groupby('user').tt.sum().reset_index(name='total_hours')

# display(total_hours)
  user     total_hours
0   n1 0 days 07:00:00
1   n2 0 days 07:30:00
2   n3 0 days 02:30:00

# to have total_hours as a float
total_hours.total_hours = total_hours.total_hours / pd.to_timedelta(1, 'h')

# display(total_hours)
  user  total_hours
0   n1          7.0
1   n2          7.5
2   n3          2.5
Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158
  • thank you for you answer. this worked as asked, I have another question, I know I could use the float numbers to get how many days a user worked through the week, but is it possible to have this without converting the time to float, I mean where the 0 days appear, if I working day is 7:30 hours – juan gomez Jan 24 '21 at 16:52
  • @juangomez not easily. Here is an [answer](https://stackoverflow.com/a/51102096/7758804), but as you can see it is not simple. – Trenton McKinney Jan 24 '21 at 16:57
  • 1
    I see, I think I can manage it then with the float, thank you! you saved me – juan gomez Jan 24 '21 at 17:09