
I have a DataFrame of downsampled Open/High/Low/Last/Change/Volume values for a security over ten years. I'm trying to get the weekly count of samples, i.e. how many samples my downsampling method (in this case a volume bar) produced per week over the entire dataset, so that I can plot it and compare it to other downsampling methods.

So far I've tried creating a column in the df called 'Year/Week', following the answers given here and here. The problem with these answers is that my end-of-year (EOY) dates such as '1997-12-30' get transformed to '1997/1': the year is the calendar year, but the week number follows the ISO calendar system, as described in this answer. This breaks my results when I apply the value_counts method.
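A minimal sketch of where the mismatch comes from, assuming pandas 1.1 or later (where `Series.dt.isocalendar()` is available). The variable names are illustrative and not part of my actual code. Combining the calendar year with the ISO week gives the misleading labels, while using the ISO year together with the ISO week is at least self-consistent:

```python
import pandas as pd

# Three sample dates from the end of 1997 (same as in the question)
dates = pd.Series(pd.to_datetime(['1997-12-22', '1997-12-29', '1997-12-30']))

# ISO calendar components: columns 'year', 'week', 'day'
iso = dates.dt.isocalendar()

# Calendar year + ISO week: the combination that causes the problem
mixed = dates.dt.year.astype(str) + '/' + iso['week'].astype(str)

# ISO year + ISO week: consistent, but late-December dates move to 1998
consistent = iso['year'].astype(str) + '/' + iso['week'].astype(str)

print(mixed.tolist())       # ['1997/52', '1997/1', '1997/1']
print(consistent.tolist())  # ['1997/52', '1998/1', '1998/1']
```

1997-12-29 is the Monday of the ISO week that contains 1998-01-01, which is why ISO assigns it to week 1 of 1998.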

My code is the following:

volumeBar['Year/Week'] = (pd.Series(volumeBar.index).dt.year.astype(str) + "/" + pd.Series(volumeBar.index).dt.week.astype(str)).values

So my question is: as it stands, the following sample DatetimeIndex

Date
1997-12-22
1997-12-29
1997-12-30

becomes

Year/Week
1997/52
1997/1
1997/1

How could I get the following expected result?

Year/Week
1997/52
1997/52
1997/52

Please keep in mind that I cannot manually correct this behavior because of the size of the dataset and the erratic nature of these results, which is due to the way the ISO calendar works.

Many thanks in advance!

FranciscoRZ
  • But will you really be OK with the last week of 1997 having 10 days? – Michał Politowski Mar 29 '19 at 10:49
  • For the purposes of this study it shouldn't be too much of a problem. To be honest the 52 could be a 53 and it wouldn't bother me, I just don't want EOY results turning into beginning of the same year results – FranciscoRZ Mar 29 '19 at 10:56

1 Answer


You can use the function get_years_week below to get years and weeks without ISO formatting.

import pandas as pd
import datetime

a = {'Date': ['1997-11-29', '1997-12-22',
'1997-12-29',
'1997-12-30']}

data  = pd.DataFrame(a)

data['Date'] = pd.to_datetime(data['Date'])

# Function for getting weeks and years
def get_years_week(data):

     # Get year from date
     data['year'] = data['Date'].dt.year

     # Week number counted in 7-day blocks from January 1 (not ISO weeks),
     # computed for the whole column at once instead of row by row
     data['week'] = (data['Date'].dt.dayofyear - 1) // 7 + 1

     # create column for week and year
     data['year/week'] = data['year'].astype(str) \
                         + '/' + data['week'].astype(str)
     return data
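As a hedged sketch of how the same idea could be applied to a DataFrame indexed by date, and then turned into weekly counts: `year_week_labels` and `volume_bar` below are hypothetical names with made-up volume values, standing in for the asker's `volumeBar`. Weeks are counted in 7-day blocks from January 1, so a year's last label is week 53 at most and never wraps back to week 1.

```python
import pandas as pd

def year_week_labels(index):
    """Return 'YYYY/W' labels, with W counted in 7-day blocks from January 1."""
    index = pd.DatetimeIndex(index)
    week = (index.dayofyear - 1) // 7 + 1
    return [f"{year}/{wk}" for year, wk in zip(index.year, week)]

# Hypothetical stand-in for the volume-bar DataFrame (values are made up)
volume_bar = pd.DataFrame(
    {'Volume': [100, 120, 90, 110]},
    index=pd.to_datetime(['1997-11-29', '1997-12-22',
                          '1997-12-29', '1997-12-30']),
)

volume_bar['Year/Week'] = year_week_labels(volume_bar.index)

# Number of samples per year/week label
weekly_counts = volume_bar['Year/Week'].value_counts(sort=False)
print(weekly_counts)
```

With this scheme 1997-12-22 falls in week 51 and the two later dates in week 52, so the labels differ slightly from the ones in the question but stay within 1997.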
user3432888
  • I'm getting year/month with your function, and the month is always equal to 12. – FranciscoRZ Mar 29 '19 at 11:09
  • This function will return the week and year for the date. Are you trying to get the month, or are you getting month and year when you pass your data to this function? I have created a sample dataframe, so you can take that as a reference and format your dates accordingly. – user3432888 Mar 29 '19 at 11:41
  • My bad, I meant to say that the result is always YEAR/12, with 12 being the "week" – FranciscoRZ Mar 29 '19 at 13:37
  • Can you please show me the dates you're inputting and the result you're getting? – user3432888 Mar 29 '19 at 13:43