I have a dataframe with a bunch of columns labelled in 'YYYY-MM' format, along with several other columns. I need to collapse the date columns into calendar quarters and take the mean; I was able to do it manually, but there are a few hundred date columns in my real data and I'd like to not have to map every single one of them by hand. I'm generating the initial df from a CSV; I didn't see anything in read_csv
that seemed like it would help, but if there's anything I can leverage there that would be great. I found dataframe.dt.to_period("Q")
that will convert a datetime object to quarter, but I'm not quite sure how to apply that here, if I can at all.
Here's a sample df (code below):
foo bar 2016-04 2016-05 2016-06 2016-07 2016-08
0 6 5 3 3 5 8 1
1 9 3 6 9 9 7 8
2 8 5 8 1 9 9 4
3 5 8 1 2 3 5 6
4 4 5 1 2 7 2 6
This code will do what I'm looking for, but I had to generate mapping by hand:
mapping = {'2016-04':'2016q2', '2016-05':'2016q2', '2016-06':'2016q2', '2016-07':'2016q3', '2016-08':'2016q3'}
df = df.set_index(['foo', 'bar']).groupby(mapping, axis=1).mean().reset_index()
New df:
foo bar 2016q2 2016q3
0 6 5 3.666667 4.5
1 9 3 8.000000 7.5
2 8 5 6.000000 6.5
3 5 8 2.000000 5.5
4 4 5 3.333333 4.0
Code to generate the initial df:
df = pd.DataFrame(np.random.randint(1, 11, size=(5, 7)), columns=('foo', 'bar', '2016-04', '2016-05', '2016-06', '2016-07', '2016-08')) '2016-07', '2016-08'))