I have data stored in a DataFrame, and I have a function that processes each row and returns the result as a new DataFrame.

import pandas as pd  

def get_data(start_time):  
    from datetime import datetime, timedelta
    start_time = datetime.strptime(start_time)
    ten_second = start_time + timedelta(0,10)
    twenty_second = start_time + timedelta(0,20)
    combine = {'start' : ten_second, 'end' : twenty_second}
    rsam=pd.DataFrame(combine, index=[0])
    return(rsam)


df_event = pd.DataFrame([["2019-01-10 13:16:25"],
             ["2019-01-29 13:56:21"],
             ["2019-02-09 14:41:21"],
             ["2019-02-07 11:28:50"]])

temp=[]
for index, row in df_event.iterrows():
    temp=get_data(row[0])

I read online that iterrows() is the suggested way to do this, but my loop still raises an error.

What I expect in the temp variable:

Index          ten_second            twenty_second
  0         2019-01-10 13:16:35   2019-01-10 13:16:45
  1         2019-01-29 13:56:31   2019-01-29 13:56:41
  2         2019-02-09 14:41:31   2019-02-09 14:41:41
  3         2019-02-07 11:29:00   2019-02-07 11:29:10
ALollz
Ghozally

1 Answer

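You don't need iterrows or your function. Convert the column to datetime with pd.to_datetime, then simply add a pd.Timedelta:

# The column holds strings, so convert it before doing date arithmetic
df_event[0] = pd.to_datetime(df_event[0])
c1 = df_event[0] + pd.Timedelta('10s')
c2 = df_event[0] + pd.Timedelta('20s')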

temp = pd.DataFrame({'ten_second':c1,
                     'twenty_second':c2})

Output

           ten_second       twenty_second
0 2019-01-10 13:16:35 2019-01-10 13:16:45
1 2019-01-29 13:56:31 2019-01-29 13:56:41
2 2019-02-09 14:41:31 2019-02-09 14:41:41
3 2019-02-07 11:29:00 2019-02-07 11:29:10

Or write a function if you need more of these columns:

def add_time(dataframe, col, seconds):
    # Shift the datetime column `col` by an offset string such as '10s'
    newcol = dataframe[col] + pd.Timedelta(seconds)
    return newcol

temp = pd.DataFrame({'ten_second':add_time(df_event, 0, '10s'),
                    'twenty_second':add_time(df_event, 0, '20s')})

Output

           ten_second       twenty_second
0 2019-01-10 13:16:35 2019-01-10 13:16:45
1 2019-01-29 13:56:31 2019-01-29 13:56:41
2 2019-02-09 14:41:31 2019-02-09 14:41:41
3 2019-02-07 11:29:00 2019-02-07 11:29:10

Or we can do this in one line using assign:

temp = pd.DataFrame().assign(ten_second=df_event[0] + pd.Timedelta('10s'),
                             twenty_second=df_event[0] + pd.Timedelta('20s'))
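
If the real per-row function truly cannot be vectorized, the loop from the question can still be made to work. The following is only a minimal sketch. It assumes the function can return a plain dict per row instead of a one-row DataFrame (that change is mine, not from the question). It addresses two problems in the original code: datetime.strptime was called without a format string, and temp was overwritten on every iteration instead of accumulating results.

```python
from datetime import datetime, timedelta

import pandas as pd


def get_data(start_time):
    # strptime needs an explicit format; this one matches "2019-01-10 13:16:25"
    start = datetime.strptime(start_time, "%Y-%m-%d %H:%M:%S")
    # Assumption: returning a dict per row (not a DataFrame) is acceptable
    return {
        "ten_second": start + timedelta(seconds=10),
        "twenty_second": start + timedelta(seconds=20),
    }


df_event = pd.DataFrame([["2019-01-10 13:16:25"],
                         ["2019-01-29 13:56:21"],
                         ["2019-02-09 14:41:21"],
                         ["2019-02-07 11:28:50"]])

# Collect one result per row, then build the DataFrame once at the end
rows = [get_data(value) for value in df_event[0]]
temp = pd.DataFrame(rows)
```

Building the DataFrame once from a list of dicts avoids growing a DataFrame inside the loop, which gets slow as the number of rows increases.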
Erfan
  • can you do `assign(ten_sec=..., twenty_sec=...)`? – Quang Hoang Aug 07 '19 at 16:12
  • Yes, we can, didn't know that. I always chain `.assign` @QuangHoang – Erfan Aug 07 '19 at 16:19
  • Thanks for your suggestion, but I really need advice on looping with a function. My real function isn't the same as the one in the question; I just reduced it to a simple function. – Ghozally Aug 07 '19 at 17:21
  • The thing is, you're wrong, but a comment section is not the place to discuss this. I use pandas daily and I only need loops for edge cases (2 or 3% of the time). It's probably your lack of knowledge that makes you think you need a loop for your problem. The pandas API provides functions for most of our problems. [Here's](https://stackoverflow.com/a/55557758/9081267) a good read on why we DON'T want to use loops with pandas. @Ghozally – Erfan Aug 07 '19 at 19:59