Reformat Monthly Measurements into Rows for Each Month

Question

I have a Pandas DataFrame with monthly data points about the climate for about 30 years, and am trying to reformat the DataFrame into a format that another model that will use the data can accept. I am wondering if there is a simple way to reformat a time series data all within a single column into several columns that each represent a month. I'll give example formats below for clarity:

Current Format:

Date	HDD	CDD
2010-01-01	3000	0
2010-02-01	2500	500
2010-03-01	2000	1000
2010-04-01	1500	1500
2010-05-01	1000	2000
2010-06-01	500	2500
2010-07-01	0	3000
2010-08-01	500	2500
2010-09-01	1000	2000
2010-10-01	1500	1500
2010-11-01	2000	1000
2010-12-01	2500	500
2011-01-01	3000	0
2011-02-01	2500	500
2011-02-01	2000	1000
...	...	...

Desired Format:

Date	Jan-HDD	Jan-CDD	Feb-HDD	Feb-CDD	Mar-HDD	Mar-CDD	...
2010	3000	0	2500	500	2000	1000	...
2011	3000	0	1500	500	2000	1000	...
...	...	...	...	...	...	...	...

Let me know if you have any suggestions, thank you!

Below is a code snippet of what I've tried, which leads to errors or a blank DataFrame depending on the parameters passed to the merge function. The code below is just a test, but I was intending to put it in a for loop across each month if it had worked.

month = 'January'
ddf = dd[dd['Date'].dt.month_name() == month].rename(columns={'HDD': '{}-HDD'.format(month), 'CDD': '{}-CDD'.format(month)})

month = 'February'
ddf_temp = dd[dd['Date'].dt.month_name() == month]
ddf_temp = ddf_temp.rename(columns={'HDD': '{}-HDD'.format(month), 'CDD': '{}-CDD'.format(month)})
##ddf = ddf.merge(ddf_temp, on='Date', how='outer')
#ddf

pd.merge(ddf, ddf_temp, on='Date')

https://stackoverflow.com/questions/47152691/how-can-i-pivot-a-dataframe — eng, Jul 19 '23 at 21:03
https://stackoverflow.com/questions/28337117/how-to-pivot-a-dataframe-in-pandas — eng, Jul 19 '23 at 21:03

score 1 · Accepted Answer · answered Jul 21 '23 at 13:41

I'm assuming that the last row in your current format table is a typo and should be 2011-03-01.

import pandas as pd

df = pd.DataFrame(
    [
        ("2010-01-01", "3000", "0"),
        ("2010-02-01", "2500", "500"),
        ("2010-03-01", "2000", "1000"),
        ("2010-04-01", "1500", "1500"),
        ("2010-05-01", "1000", "2000"),
        ("2010-06-01", "500", "2500"),
        ("2010-07-01", "0", "3000"),
        ("2010-08-01", "500", "2500"),
        ("2010-09-01", "1000", "2000"),
        ("2010-10-01", "1500", "1500"),
        ("2010-11-01", "2000", "1000"),
        ("2010-12-01", "2500", "500"),
        ("2011-01-01", "3000", "0"),
        ("2011-02-01", "2500", "500"),
        ("2011-03-01", "2000", "1000"),
    ],
    columns=["Date", "HDD", "CDD"],
)
df["Date"] = pd.to_datetime(df["Date"])
df = pd.pivot_table(df, values=["HDD", "CDD"], index=df["Date"].dt.year, columns=df["Date"].dt.month)
print(df)

Output:

      CDD                         HDD
Date   1      2       3            1       2       3
2010  0.0  500.0  1000.0  ...  3000.0  2500.0  2000.0  ...
2011  0.0  500.0  1000.0  ...  3000.0  2500.0  2000.0  ...

Reformat Monthly Measurements into Rows for Each Month

1 Answers1