I'm using pandas and have 3 columns of data containing a day, a month, and a year. I want to loop over these values to create a new column in my dataframe that shows the week number. My data starts on October 1, and I need that date to fall in week 1.

I've tried using this code:

for a, b, c in zip(year, month, day):
    print(datetime.date(a, b, c).strftime("%U"))

But this assumes that the first week is in January. I'm also unsure how to assign what's in the loop to a new column. I was just printing what was in the for loop to test it out.

Thanks

FObersteiner
Tony

1 Answer

I think this is what you want:

import pandas as pd
import datetime

# week number within the calendar year via %U
# (weeks start on Sunday; days before the first Sunday are week 0)
def get_week_number(y, m, d):
    return int(datetime.date(y, m, d).strftime('%U'))

# the offset: the %U week number of October 1st
# (the year 2021 is hard-coded to match the sample data below)
offset = get_week_number(2021, 10, 1)

def compute_week_number(year, month, day):
    """
    Compute the week number shifted by the offset,
    so that the week containing October 1st becomes week 1.
    Only meaningful for dates from October to December 2021.
    """
    return get_week_number(year, month, day) - offset + 1

df = pd.DataFrame({'year':[2021, 2021, 2021], 
                   'month':[10, 10, 10], 
                   'day':[1, 6, 29]})

df['week_number'] = df.apply(
    lambda x: compute_week_number(x['year'], x['month'], x['day']),
    axis=1)

Using apply with axis=1 calls the function once for each row of the dataframe; the value returned for a row becomes that row's entry in the new column.
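As a possible alternative to the row-wise apply, the same numbers can be computed on whole columns. This is only a sketch, assuming the columns are named year, month and day as in the dataframe above and that all dates fall between October and December 2021: pd.to_datetime builds one timestamp per row from those three columns, and the .dt accessor applies strftime to the whole column at once.

```python
import pandas as pd

df = pd.DataFrame({'year': [2021, 2021, 2021],
                   'month': [10, 10, 10],
                   'day': [1, 6, 29]})

# assemble one timestamp per row from the year/month/day columns
dates = pd.to_datetime(df[['year', 'month', 'day']])

# %U week number (Sunday as first day of the week), converted to integers
week_of_year = dates.dt.strftime('%U').astype(int)

# offset: %U week number of October 1st (year hard-coded, as in the code above)
offset = int(pd.Timestamp(2021, 10, 1).strftime('%U'))

df['week_number'] = week_of_year - offset + 1
```

For the three sample rows this produces the same week numbers as the apply version.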

I used %U (the week number of the year, with Sunday as the first day of the week) and subtracted the week number of October 1st to get the new week number you asked for.

Week 39 becomes week 1, week 40 becomes week 2 and so on.

This gives:

year  month  day  week_number
2021     10    1            1
2021     10    6            2
2021     10   29            5
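The offset above is tied to 2021, and %U restarts at 0 every January, so dates from January to September would come out negative. One way around this, sketched below with a hypothetical helper named week_from_october, is to count whole weeks from the Sunday on or before the most recent October 1st. It assumes Sunday-based weeks (to match %U) and that the "year" of interest always starts on October 1st.

```python
import datetime

import pandas as pd


def week_from_october(year, month, day):
    """Week number counted from the week containing the latest October 1st."""
    date = datetime.date(year, month, day)
    # the counting period starts on the most recent October 1st on or before the date
    start_year = year if month >= 10 else year - 1
    oct_first = datetime.date(start_year, 10, 1)
    # step back to the Sunday on or before October 1st
    # (isoweekday: Monday=1 ... Sunday=7, so Sunday steps back 0 days)
    week_start = oct_first - datetime.timedelta(days=oct_first.isoweekday() % 7)
    return (date - week_start).days // 7 + 1


df = pd.DataFrame({'year': [2021, 2021, 2021, 2022],
                   'month': [10, 10, 10, 1],
                   'day': [1, 6, 29, 5]})

# build the new column from the three existing ones
df['week_number'] = [week_from_october(y, m, d)
                     for y, m, d in zip(df['year'], df['month'], df['day'])]
```

The first three rows give 1, 2 and 5 as before, and January 5, 2022 gives week 15 rather than a negative number.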
DavidK