1

Hi I have the following data:
index, Day of week, Week no, Fecha

360      Friday       52 2019-12-27
361    Saturday       52 2019-12-28
362      Sunday       53 2019-12-29
363      Monday       53 2019-12-30
364     Tuesday       53 2019-12-31
365   Wednesday        1 2020-01-01
366    Thursday        1 2020-01-02
367      Friday        1 2020-01-03
368    Saturday        1 2020-01-04
369      Sunday        2 2020-01-05
370      Monday        2 2020-01-06

I would like:
-the week that contains the first of January to be week no 1
-to have the weeks start on Sunday
-to have week no 1 as a full week of 7 days, meaning the 29th, 30th and 31th of December to also get week no 1.
-to get this to work also when I have a lot of years in this data set.

In this particular year it means changing all 53's to 1's but I think there might be other years where this won't work. So to get a general rule, I realized if the first of January falls on a Sunday I don't need to change anything so I thought to first check that for every year and if the first of January doesn't fall on a Sunday to change all the week no's between the previous Sunday and that Sunday to 1. Another option I thought of is to find out what week no the previous Sunday has and then to change all week no's of that year with the same number as the previous Sunday, to 1. For both I would need to do a condition within a df to only filter out certain rows but how do I do that when I only want to display one column of that df? Meaning if I would do:

totals[(totals['Fecha'].dt.month==1) & (totals['Fecha'].dt.day==1) & (totals['Fecha'].dt.year==i)]

then this would show all columns in totals while I would want and these conditions and to only see column 'Week day'.

So how would I do that and also, this all sounds super complicated to me. Is there an easier/more efficient way to this that I overlooked?

Thank you!

Pythn
  • 171
  • 2
  • 10
  • Does this answer your question? [How can I find week numbers with weeks starting on Sunday in Python?](https://stackoverflow.com/questions/28305034/how-can-i-find-week-numbers-with-weeks-starting-on-sunday-in-python) – pink spikyhairman May 03 '20 at 11:15

4 Answers4

1

So this is what I've come up with in the end. How is this performance wise?

totals['Fecha']=pd.to_datetime(totals['Fecha'], format='%d/%m/%Y') #change type to datetime
totals['Day of week']=totals['Fecha'].dt.weekday_name   #create day of week 'Sunday, Monday, etc'
totals['Week no']=totals['Fecha'].dt.strftime('%U').astype(int)+1 #create week no's with Sunday as first day of week

for i in set(totals['Fecha'].dt.year):
    if i!=2019: #because for the first year we don't have a previous end of year
        first_day_of_year=str(i)+'-01-01' 
        # if there are any rows where the day of the week of the first day of the year equals 'Sunday'
        if any(totals['Day of week'].where(totals['Fecha']==first_day_of_year)!='Sunday'):

        # then for the year before, change all the last week no's to one
            last_week=max(totals['Week no'].where(totals['Fecha'].dt.year==i-1))
            totals.loc[(totals['Week no']==last_week)&(totals['Fecha'].dt.year==i-1), 'Week no']=1

print(totals[['Day of week', 'Week no', 'Fecha']])
Pythn
  • 171
  • 2
  • 10
0

You can use the mod operator. This will give you the remainder after dividing by a given number. Therefor, 52 % 52 = 0 and 0 % 52 = 0. Mod only really works when you start counting from 0, so you'll have to minus one fisrt, see below:

my_week = 53
my_bounded_week = ((my_week - 1) % 52) + 1
# First minus one to make the series start at 0.
# Then add one after the mod to make the series start at 1

print(my_bounded_week)
# prints 1
GTBebbo
  • 1,198
  • 1
  • 8
  • 17
  • Thank you. Isn't that the same as just replacing all 53's to 1? I'm not sure if that will work because if there will be a year where February has 29 days AND the first of January falls on a Saturday, meaning the 2nd of January is a Sunday and week 2, than that year will have 54 week no's so the 53's need to stay as the are and only the 54's change to 1. Confusing! – Pythn May 03 '20 at 12:15
  • In that case you would have to use the mod of the `maximum number of weeks - 1`. So once you've finished calculating the weeks, go backwards through the year modding the final week – GTBebbo May 03 '20 at 12:19
  • That wouldn't work either because if the first of january is a Sunday, you would want to leave all dates of december as the last week of the year. – Pythn May 03 '20 at 18:07
0

Use the datetime package as described in this StackOverflow answer: How can I find week numbers with weeks starting on Sunday in Python?

pink spikyhairman
  • 2,391
  • 1
  • 16
  • 13
  • I have been using this package and that is how I've gotten the current 'Week no' column but as you see it turns out a little different from what I need. If you know a function that turns all days in that week to 1 (even the ones in december) I would love to hear about that. Thanks! – Pythn May 03 '20 at 12:06
0

Seems like you need your own custom business calendar, we can use a small function to create one.

Assuming you're creating a calendar starting on the first calendar day of each calendar year then this will work.

One caveat is that I've not written this for multiple years, I'll leave that up to you :)

Usage

df = business_cal('01-01-2019','01-01-2020')

print(df.head(5))

        date  weeks  dayofmonth  dayofweek daynameofweek
0 2018-12-30      1          30          6        Sunday
1 2018-12-31      1          31          0        Monday
2 2019-01-01      1           1          1       Tuesday
3 2019-01-02      1           2          2     Wednesday
4 2019-01-03      1           3          3      Thursday

Function.

def business_cal(start,end):
    """
    Function that returns a calendar year given a start and end date.
    Constrains - week must start on Sunday if 01/01/2020 is not Sunday,
    we take the last Sunday of the previous year.
    """
    start_date = pd.to_datetime(start)
    
    if start_date.weekday() != 6:
        start_date = start_date - pd.DateOffset(days=(start_date.weekday() + 1))
    else:
        start_date


    dates = pd.date_range(start_date,end,freq='7D')
    
    df = pd.DataFrame(dates,columns=['date'])
    # grab week numbers.
    df['weeks'] = df.index + 1 
    df1 = df.set_index('date').resample('D').ffill().reset_index()
    
    df1['dayofmonth'] = df1['date'].dt.day
    df1['dayofweek'] = df1['date'].dt.dayofweek
    df1['daynameofweek'] = df1['date'].dt.day_name()
    return df1
Community
  • 1
  • 1
Umar.H
  • 22,559
  • 7
  • 39
  • 74