1

I have a dataframe like:

data = {'year': [2020, 2020, 2021, 2021], 'week': [52, 53, 1, 2]}
df = pd.DataFrame(data=data)

   year  week
0  2020    52
1  2020    53
2  2021     1
3  2021     2

and I would like to get for each line the Monday (date) of that period so something like:

   year  week  period
0  2020    52  2020-12-21
1  2020    53  2020-12-28
2  2021     1  2021-01-04
3  2021     2  2021-01-11

What is the correct way to do this in pandas?

Benjamin
  • 3,350
  • 4
  • 24
  • 49
  • See if [this answer](https://stackoverflow.com/a/17087427/5386938) helps. –  Apr 09 '21 at 13:54
  • @JustinEzequiel I tried something like this (after conversion in pandas) this but results are wrong. 2 lines in the above example have the same result – Benjamin Apr 09 '21 at 13:56
  • @JustinEzequiel your were right, I did not see the %G-W%V-%u at the end. That solves my issue. Thx – Benjamin Apr 09 '21 at 14:00

2 Answers2

1

The right way to do this is to use ISO date format pattern:

df["period"] = pd.to_datetime(
    df.year.astype(str) + '-W' + df.week.astype(str) + '-1',
    format='%G-W%V-%u')\
    .dt.strftime('%Y-%m-%d')

and not

df["period"] = pd.to_datetime(
    df.year.astype(str) + '-W' + df.week.astype(str) + '-1',
    format='%Y-W%W-%w')\
    .dt.strftime('%Y-%m-%d')

Because of ISO week number (cf comments)

Benjamin
  • 3,350
  • 4
  • 24
  • 49
0

You can try with:

data = {'year': [2020, 2020, 2021, 2021], 'week': [52, 53, 1, 2]}
df = pd.DataFrame(data=data)

df['date combined']=df['year'].astype(str)+'-'+df['week'].astype(str)+ '-1'
df['date of Monday']=pd.to_datetime(df['date combined'], format='%Y-%W-%w')

print(df)

result:

   year  week date combined       date
0  2020    52     2020-52-1 2020-12-28
1  2020    53     2020-53-1 2021-01-04
2  2021     1      2021-1-1 2021-01-04
3  2021     2      2021-2-1 2021-01-11
Renaud
  • 2,709
  • 2
  • 9
  • 24