The date_list and monthly_values arrays correspond element by element, so the data point '2019-09-01 00:00:00' = 15, '2019-10-01 00:00:00' = 39.6, and so on. The year_changes function shows the indexes where a new year begins. Since 4 months of 2019 are present (from 2019-09-01 00:00:00 up to, but not including, 2020-01-01 00:00:00), it takes the sum of 15., 39.6, 0.2, 34.3 and divides by the number of months in 2019, which is 4, giving the expected output of 22.28. Instead of only the mean, I am trying to make a table that shows the mean, median, max, and min for each year. How would I be able to code such a thing?
import numpy as np
import pandas as pd
date_list = ['2019-09-01 00:00:00', '2019-10-01 00:00:00', '2019-11-01 00:00:00',
             '2019-12-01 00:00:00', '2020-01-01 00:00:00', '2020-02-01 00:00:00',
             '2020-03-01 00:00:00', '2020-04-01 00:00:00', '2020-05-01 00:00:00',
             '2020-06-01 00:00:00', '2020-07-01 00:00:00', '2020-08-01 00:00:00',
             '2020-09-01 00:00:00', '2020-10-01 00:00:00', '2020-11-01 00:00:00',
             '2020-12-01 00:00:00', '2021-01-01 00:00:00', '2021-02-01 00:00:00',
             '2021-03-01 00:00:00', '2021-04-01 00:00:00', '2021-05-01 00:00:00',
             '2021-06-01 00:00:00', '2021-07-01 00:00:00']
monthly_values = np.array([15., 39.6, 0.2, 34.3, 19.6, 26.8, 15.7, 26., 12.6, 15.5, 18.6, 2.3,
                           6.5, 2.5, 12.2, 11.6, 93.9, 25.5, 26.5, -16.5, -1.4, -1.8, 5.])
data = pd.DataFrame({"Date": date_list, "Averages": monthly_values})
data["Date"] = pd.to_datetime(data["Date"])
print(data.groupby(data["Date"].dt.year).mean())
Output:
       Averages
Date
2019  22.275000
2020  14.158333
2021  18.742857
Expected Output:
       Averages  Median   Max   Min
Date
2019  22.275000   24.65  39.6   0.2
2020  14.158333   14.05  26.8   2.3
2021  18.742857    5.00  93.9 -16.5
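
One possible way to get all four statistics per year is pandas named aggregation on the grouped column. This is only a sketch, assuming pandas 0.25 or later; the name summary and the use of pd.date_range to rebuild the monthly dates are illustrative choices, not part of the original code.

```python
import numpy as np
import pandas as pd

# Assumption: the dates are consecutive month starts, so date_range can rebuild them.
dates = pd.date_range("2019-09-01", "2021-07-01", freq="MS")
monthly_values = np.array([15., 39.6, 0.2, 34.3, 19.6, 26.8, 15.7, 26., 12.6, 15.5, 18.6, 2.3,
                           6.5, 2.5, 12.2, 11.6, 93.9, 25.5, 26.5, -16.5, -1.4, -1.8, 5.])

data = pd.DataFrame({"Date": dates, "Averages": monthly_values})

# Group the values by calendar year, then compute one named column per statistic.
summary = data.groupby(data["Date"].dt.year)["Averages"].agg(
    Averages="mean",
    Median="median",
    Max="max",
    Min="min",
)
print(summary)
```

Each keyword passed to agg becomes a column name in the result, so the table comes out with the columns Averages, Median, Max, Min and one row per year.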