I have a big dataset of timed solar panel yield logs per ten minutes throughout multiple years. I would like to get a bar chart which displays the monthly summation of these logs per year.
I don't know if this makes sense, so let me explain with an example.
Through resampling, I got the following dataframe:
Last updated
2017-01-31 00:00:00+01:00 24.1
2017-02-28 00:00:00+01:00 44.8
2017-03-31 00:00:00+02:00 140.1
2017-04-30 00:00:00+02:00 168.7
2017-05-31 00:00:00+02:00 194.4
2017-06-30 00:00:00+02:00 214.7
2017-07-31 00:00:00+02:00 204.6
2017-08-31 00:00:00+02:00 159.5
2017-09-30 00:00:00+02:00 117.3
2017-10-31 00:00:00+01:00 74.8
2017-11-30 00:00:00+01:00 30.4
2017-12-31 00:00:00+01:00 7.2
2018-01-31 00:00:00+01:00 9.4
Freq: M, Name: Yield (kWh), dtype: float64
I think the best way of achieving my desired plot is convert this somehow into a dataframe with
- an index containing only months (is this possible with DateTimeIndex, or shall I simply use a RangeIndex from 1 to 12?)
- a column for each year (2016 and 2017 in this example)
- the values reorganised accordingly
So, for example:
Last updated 2017 2018
Jan 24.1 9.4
Feb 44.8 0
Mar 140.1 0
Apr 168.7 0
May 194.4 0
Jun 214.7 0
Jul 204.6 0
Aug 159.5 0
Sep 117.3 0
Oct 74.8 0
Nov 30.4 0
Dec 7.2 0
I can easily achieve this with a some naive looping, but I'm convinced that pandas must hiding some clever trick here from me. Can somebody point me to an efficient solution?
Thanks a lot!