1

I search in previous questions, and it does not resolve what i am searching, please can u help me

I have a dataset from

    Date         T2M    Y       T       F        H       G     Week_Number
0   1981-01-01  11.08   17.35   6.94    0.00    5.37    4.63    1
1   1981-01-02  10.82   16.41   7.51    0.00    5.55    2.73    1
2   1981-01-03  10.74   15.64   7.35    0.00    6.23    2.33    1
3   1981-01-04  11.17   15.99   8.46    0.00    6.16    1.66    1
4   1981-01-05  10.20   15.60   6.87    0.12    6.10    2.78    2
5   1981-01-06  10.35   16.16   5.95    0.00    6.59    3.92    2
6   1981-01-07  12.26   18.24   9.30    0.00    6.10    2.30    2
7   1981-01-08  12.76   19.23   8.72    0.00    6.29    3.96    2
8   1981-01-09  12.61   17.80   8.90    0.00    6.71    2.05    2

I already created a column of the week number using this code

df['Week_Number'] = df['Date'].dt.week

but it gives me only the first four days of the year that design the first week, maybe it means that the week start from monday. In my cases I don t give interest if it start from monday or another day, I just want to subdivise each year every seven days (group every 7 days of each year like from 1 1 1980 to 07 1 1980 FISRT WEEK, and go on, and every next year the first week starts too from 1 1 xxxx

meoza
  • 15
  • 9
  • if you want the week count always start from the first of Jan, simply take the period and divide by 7... example: (date_of interest - 01/01/2020)/7 just decide about rounding, should be always round Up if I am not mistaken – NoobVB May 22 '22 at 17:54

1 Answers1

1

If you want your week numbers to start from the 1st of January, irrespective of the day of week, simply get the day of year, subtract 1 and compute the integer division by 7:

df['Date'] = pd.to_datetime(df['Date'])
df['week_number'] = df['Date'].dt.dayofyear.sub(1).floordiv(7).add(1)

NB. you do not need to add 1 if you want the first week to start with 0

output:

        Date    T2M      Y     T     F     H     G  Week_Number  week_number
0 1981-01-01  11.08  17.35  6.94  0.00  5.37  4.63            1            1
1 1981-01-02  10.82  16.41  7.51  0.00  5.55  2.73            1            1
2 1981-01-03  10.74  15.64  7.35  0.00  6.23  2.33            1            1
3 1981-01-04  11.17  15.99  8.46  0.00  6.16  1.66            1            1
4 1981-01-05  10.20  15.60  6.87  0.12  6.10  2.78            2            1
5 1981-01-06  10.35  16.16  5.95  0.00  6.59  3.92            2            1
6 1981-01-07  12.26  18.24  9.30  0.00  6.10  2.30            2            1
7 1981-01-08  12.76  19.23  8.72  0.00  6.29  3.96            2            2
8 1981-01-09  12.61  17.80  8.90  0.00  6.71  2.05            2            2

Then you can use the new column to groupby, for example:

df.groupby('week_number').agg({'Date': ['min', 'max'], 'T2M': 'sum'})

output:

                  Date               T2M
                   min        max    sum
week_number                             
1           1981-01-01 1981-01-07  76.62
2           1981-01-08 1981-01-09  25.37
mozway
  • 194,879
  • 13
  • 39
  • 75
  • Thank you, it worked for me, but for aggregation I want to agregate the first week in each year seperatly, I mean sum week value of week 1 in year 1981 and the week 2 ... and the year 1982 sum the values of week 1 and so on – meoza May 22 '22 at 20:25
  • it is similar to aggregating the value weekly – meoza May 22 '22 at 20:26
  • Then group by year and week number – mozway May 22 '22 at 20:26
  • can you please add it in the response , I tried this code to group but it did not workdf.groupby('Date','week_number').agg({'Date': ['min', 'max'], 'T2M': 'sum'}) – meoza May 22 '22 at 20:30
  • `df.groupby([df['Date'].dt.year, 'week_number']).agg({'Date': ['min', 'max'], 'T2M': 'sum'})` – mozway May 23 '22 at 04:51
  • hello, well when i tried the first code ((((df['week_number'] = df['Date'].dt.dayofyear.sub(1).floordiv(7).add(1) ))) i saw this warning (((((usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:2: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead ))))))) I don t know what to change – meoza May 25 '22 at 10:37
  • you should read [this Q/A](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas), this warning might be due to prior code. Try `df = df.copy()` before running your code, does it disappear? – mozway May 25 '22 at 10:39