0

I have a data frame that looks like this:

Unit #  Start Date  End Date

7417      2/6/2017  3/5/2017

I want to split it by weeks. The output should be:

Unit #  Start Date  End Date

7417 2/6/2017 2/12/2017
7417 2/13/2017 2/19/2017
7417 2/20/2017 2/26/2017
7417 2/27/2017 3/05/2017

Could someone help me with this?

Rohith

2 Answers

5

You didn't write any code, so I won't give you a complete solution.

Here's a way to iterate weeks between two dates:

import datetime

date_format = "%m/%d/%Y"
d1 = datetime.datetime.strptime("2/6/2017", date_format).date()
d2 = datetime.datetime.strptime("3/5/2017", date_format).date()
d = d1
step = datetime.timedelta(days=7)

while d < d2:
    print(d.strftime(date_format))
    d += step
    
# 02/06/2017
# 02/13/2017
# 02/20/2017
# 02/27/2017
Eric Duminil
    Yeah, but it stops at 02/27/2017. If you also want to reach 03/05/2017 and cover the time in between, you need to add an else after your while and append the end date there. – Khan Dec 07 '18 at 19:31
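The point raised in the comment can also be handled without an else branch. What follows is only a minimal sketch, not the answerer's code: `week_ranges` is a hypothetical helper name, and it assumes the end date is meant to be inclusive. It yields a start and an end for every week and clips the last week so it never runs past the end date.

```python
import datetime

DATE_FORMAT = "%m/%d/%Y"


def week_ranges(start, end, days=7):
    """Yield (week_start, week_end) date pairs covering start..end inclusive."""
    step = datetime.timedelta(days=days)
    one_day = datetime.timedelta(days=1)
    current = start
    while current <= end:
        # A week ends 6 days after it starts, but never past the overall end date.
        week_end = min(current + step - one_day, end)
        yield current, week_end
        current += step


d1 = datetime.datetime.strptime("2/6/2017", DATE_FORMAT).date()
d2 = datetime.datetime.strptime("3/5/2017", DATE_FORMAT).date()

for week_start, week_end in week_ranges(d1, d2):
    print(week_start.strftime(DATE_FORMAT), week_end.strftime(DATE_FORMAT))

# 02/06/2017 02/12/2017
# 02/13/2017 02/19/2017
# 02/20/2017 02/26/2017
# 02/27/2017 03/05/2017
```

Using `<=` in the loop condition means a range whose last week starts exactly on the end date still produces a final one-day entry.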
2

Supposing you have a dataframe with multiple Unit # values, you can use the following code to achieve what you want:

import datetime
import pandas as pd

df = pd.DataFrame(
    [[7417, "2/6/2017", "3/5/2017"], [7418, "3/6/2017", "4/7/2017"]],
    columns=["Unit #", "Start Date", "End Date"],
)

# Convert the date columns from strings to datetimes
df["Start Date"] = pd.to_datetime(df["Start Date"], format="%m/%d/%Y")
df["End Date"] = pd.to_datetime(df["End Date"], format="%m/%d/%Y")

# Collect output rows in a list; DataFrame.append was removed in pandas 2.0
rows = []
week = 7

# Iterate over dataframe rows
for index, row in df.iterrows():
    date = row["Start Date"]
    date_end = row["End Date"]
    unit = row["Unit #"]
    # Emit one full 7-day (start, end) pair per week in the row's range
    while date < date_end:
        date_next = date + datetime.timedelta(week - 1)
        rows.append([unit, date, date_next])
        date = date_next + datetime.timedelta(1)

# Build the output with the same columns as the original dataframe
df_out = pd.DataFrame(rows, columns=df.columns)

So if your input dataframe is:

>>> df
   Unit #          Start Date            End Date
0    7417 2017-02-06 00:00:00 2017-03-05 00:00:00
1    7418 2017-03-06 00:00:00 2017-04-07 00:00:00

The output df_out would look like this:

>>> df_out
   Unit #           Start Date             End Date
0    7417  2017-02-06 00:00:00  2017-02-12 00:00:00
1    7417  2017-02-13 00:00:00  2017-02-19 00:00:00
2    7417  2017-02-20 00:00:00  2017-02-26 00:00:00
3    7417  2017-02-27 00:00:00  2017-03-05 00:00:00
4    7418  2017-03-06 00:00:00  2017-03-12 00:00:00
5    7418  2017-03-13 00:00:00  2017-03-19 00:00:00
6    7418  2017-03-20 00:00:00  2017-03-26 00:00:00
7    7418  2017-03-27 00:00:00  2017-04-02 00:00:00
8    7418  2017-04-03 00:00:00  2017-04-09 00:00:00
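Note that the last row for unit 7418 ends on 2017-04-09, which is past its End Date of 2017-04-07, because every chunk is a full 7 days. If the final week should stop at the End Date instead, the following sketch is one way to do it. The function name `split_by_week` and its parameters are hypothetical, and it assumes the date columns have already been converted with `pd.to_datetime`.

```python
import pandas as pd


def split_by_week(df, start_col="Start Date", end_col="End Date"):
    """Return one row per 7-day chunk, clipping the last chunk at the end date."""
    rows = []
    for _, row in df.iterrows():
        start, end = row[start_col], row[end_col]
        while start <= end:
            # Each chunk spans 7 days, but never runs past the row's end date.
            chunk_end = min(start + pd.Timedelta(days=6), end)
            new_row = row.to_dict()
            new_row[start_col] = start
            new_row[end_col] = chunk_end
            rows.append(new_row)
            start = chunk_end + pd.Timedelta(days=1)
    return pd.DataFrame(rows, columns=df.columns)


df = pd.DataFrame(
    [[7417, "2/6/2017", "3/5/2017"], [7418, "3/6/2017", "4/7/2017"]],
    columns=["Unit #", "Start Date", "End Date"],
)
df["Start Date"] = pd.to_datetime(df["Start Date"], format="%m/%d/%Y")
df["End Date"] = pd.to_datetime(df["End Date"], format="%m/%d/%Y")

df_clipped = split_by_week(df)
print(df_clipped)
```

With this variant the last row for unit 7418 runs from 2017-04-03 to 2017-04-07, and any other columns in the input are carried over unchanged to each weekly row.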
Cedric Zoppolo