4

I want to add 1 day every 3 rows.

The date

import datetime

start_date = "01/02/21"

date_1 = datetime.datetime.strptime(start_date, "%d/%m/%y")
end_date = date_1 + datetime.timedelta(days=1)
df_4["date"] = end_date  # assigns the same date to every row

The wanted output

A   date
1   01/02/21
2   01/02/21
3   01/02/21
4   02/02/21 # add 1 day
5   02/02/21
6   02/02/21
7   03/02/21 # add 1 day
8   03/02/21
9   03/02/21
10  04/02/21 # add 1 day
11  04/02/21
12  04/02/21
...

Currently this adds 1 day to every row instead of adding one more day every 3 rows.

Mathieu
  • What does the input data look like? – jezrael Feb 03 '21 at 09:28
  • df_4 has multiple columns and multiple rows. I just want to add a date for each row – Mathieu Feb 03 '21 at 09:30
  • Does that mean add 1 day per row? Or add 1 day to increment the date? – jezrael Feb 03 '21 at 09:31
  • Increment the day, I guess. Add one day relative to the three previous rows. – Mathieu Feb 03 '21 at 09:32
  • Are you sure? So the input data sample has 12 rows? – jezrael Feb 03 '21 at 09:33
  • Sorry, I'm not a good English speaker. Let me clarify. Let's say the starting date is 01/01/2021. The first three rows will have 01/01/2021 as the date. Then the next three rows will have + 1 day = 02/01/2021, then the next three rows will have + 1 day = 03/01/2021, etc. (the input data has thousands of rows). So it can run many years into the future (and more) – Mathieu Feb 03 '21 at 09:36
  • So the input is an empty column? And the input is `01/01/2021`? – jezrael Feb 03 '21 at 09:39

2 Answers

2

If each block of 3 rows should share one date, starting at start_date, add day offsets built from np.arange over the length of the DataFrame, integer-divided by 3. This is vectorized, so it is faster than loop-based solutions:

import datetime
import numpy as np
import pandas as pd

start_date = "01/02/21"

date_1 = datetime.datetime.strptime(start_date, "%d/%m/%y")
df["date"] = date_1 + pd.to_timedelta(np.arange(len(df)) // 3, unit='d')


print (df)
     A       date Note
0    1 2021-02-01  NaN
1    2 2021-02-01  NaN
2    3 2021-02-01  NaN
3    4 2021-02-02  add
4    5 2021-02-02  NaN
5    6 2021-02-02  NaN
6    7 2021-02-03   ad
7    8 2021-02-03  NaN
8    9 2021-02-03  NaN
9   10 2021-02-04  add
10  11 2021-02-04  NaN
11  12 2021-02-04  NaN

Details:

print (np.arange(len(df)) // 3)
[0 0 0 1 1 1 2 2 2 3 3 3]

print (pd.to_timedelta(np.arange(len(df)) // 3, unit='d'))
TimedeltaIndex(['0 days', '0 days', '0 days', '1 days', '1 days', '1 days',
                '2 days', '2 days', '2 days', '3 days', '3 days', '3 days'],
               dtype='timedelta64[ns]', freq=None)
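
As a self-contained check of the steps above, here is a minimal sketch. The 12-row frame is a hypothetical stand-in for the real data, and the extra date_str column is only an assumption about how one might render the result back in the question's dd/mm/yy format.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical sample frame standing in for the real data (12 rows, one column "A").
df = pd.DataFrame({"A": range(1, 13)})

start = datetime.datetime.strptime("01/02/21", "%d/%m/%y")

# Row positions 0..11 integer-divided by 3 give each row's day offset.
offsets = np.arange(len(df)) // 3
df["date"] = start + pd.to_timedelta(offsets, unit="D")

# Optional: format the dates as dd/mm/yy strings, as in the wanted output.
df["date_str"] = df["date"].dt.strftime("%d/%m/%y")
print(df)
```

Because the offsets come from row positions rather than index labels, this should behave the same whatever the DataFrame's index is.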
jezrael
1

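The snippet below increments the date by one more day after every 3 rows (as set by add_every_x_rows). It assumes every row of df['Date'] initially holds the same start date as a dd/mm/yy string:

import datetime

df['Date'] = [datetime.datetime.strptime(str(x), "%d/%m/%y").date() for x in df['Date']]

add_every_x_rows = 3

day_counter = 0
days_to_add = 0
for i, row in df.iterrows():
    day_counter += 1
    if day_counter == add_every_x_rows + 1:
        days_to_add += 1  # a new block starts: one more day than the previous block
        day_counter = 1
    # apply the accumulated offset to every row, not only the first row of a block
    df.at[i, 'Date'] = row['Date'] + datetime.timedelta(days=days_to_add)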

If you later change the value of add_every_x_rows to 4, it will increment the date by one day every four rows.
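
For comparison, the same configurable block size can be had without an explicit row loop. This is only a sketch using pd.date_range and DatetimeIndex.repeat; the helper name block_dates, its parameters, and the 10-row sample frame are hypothetical and not part of the answer above.

```python
import math

import pandas as pd


def block_dates(start, n_rows, rows_per_day=3):
    """Return n_rows dates in which each date repeats rows_per_day times."""
    # Enough distinct days to cover all rows, rounding the last block up.
    n_days = math.ceil(n_rows / rows_per_day)
    days = pd.date_range(start=start, periods=n_days, freq="D")
    # Repeat every day rows_per_day times, then trim to the exact row count.
    return days.repeat(rows_per_day)[:n_rows]


# Hypothetical 10-row frame; with rows_per_day=4 the last block is incomplete.
df = pd.DataFrame({"A": range(1, 11)})
df["Date"] = block_dates(pd.Timestamp(2021, 2, 1), len(df), rows_per_day=4)
print(df)
```

Trimming after the repeat keeps the result the same length as the frame even when the row count is not a multiple of the block size.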

Ishwar Venugopal
  • Thank you for your help. The accepted answer is an easier way to do it for me. But thanks anyway – Mathieu Feb 03 '21 at 09:47
  • Maybe you can avoid `iterrows`, https://stackoverflow.com/questions/24870953/does-pandas-iterrows-have-performance-issues/24871316#24871316 – jezrael Feb 03 '21 at 10:43