Data imported from CSV:
Date | Item_1 | Item_2 |
---|---|---|
1990-01-01 | 34 | 78 |
1990-01-02 | 42 | 19 |
. | . | . |
. | . | . |
2020-12-31 | 41 | 23 |
import pandas as pd
df = pd.read_csv(r'Insert file directory')
df.index = pd.to_datetime(df.index)
gb = df.groupby([df.index.year, df.index.month]).mean()
Issue:
The requirement is to group the data by year and month before processing. I thought that groupby would group the rows so that mean() calculates the average of all values under Jan-1990, Feb-1990 and so on. However, I was wrong: the output is a single average of all values under Item_1, rather than one average per month.
My example is similar to the post below, but in my case it is calculating the mean. I am guessing that the problem lies either in how the data is arranged after groupby or in some parameter of mean() that has to be specified, but I have no idea which is the cause. Can someone enlighten me on how to correct the code?
Update: Hi all, I have created a sample .csv data file with 3 items and 3 months of data. I am wondering whether the cause is the way the data is converted into df when it is imported from the .csv, because I have noticed some weird time data in the leftmost column, as shown below:
Link to the sample file: https://www.mediafire.com/file/t81wh3zem6vf4c2/test.csv/file
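
A possible explanation, sketched under the assumption that the file has a `Date` column as in the table above: `read_csv` without `index_col` produces a default integer index (0, 1, 2, ...), and `pd.to_datetime` reads plain integers as nanoseconds since 1970-01-01. Every row would then fall into the single group (1970, 1), which matches both the one overall average and the odd timestamps in the leftmost column. The sketch below is not a confirmed diagnosis of the actual file. The inline `csv_text` is made-up stand-in data, and the level names `year` and `month` are only illustrative.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real CSV; the values are made up for illustration.
csv_text = """Date,Item_1,Item_2
1990-01-01,34,78
1990-01-02,42,19
1990-02-01,10,20
1990-02-02,30,40
"""

# What the original code does: the default index is 0, 1, 2, ...
df_bad = pd.read_csv(io.StringIO(csv_text))
bad_index = pd.to_datetime(df_bad.index)  # integers are read as nanoseconds since 1970-01-01
bad_groups = df_bad.groupby([bad_index.year, bad_index.month]).ngroups
print(bad_groups)  # 1: every row lands in the group (1970, 1)

# Possible fix: parse the Date column and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")
gb = df.groupby(
    [df.index.year.rename("year"), df.index.month.rename("month")]
).mean()
print(gb)
```

With the real file, the same `parse_dates` and `index_col` arguments would go into the existing `read_csv` call, assuming the date column is really named `Date`.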