Separating a dataframe by date and calculating mathematical models Numpy Python

Question

The date_list and the monthly_values array correspond element by element, so '2019-09-01 00:00:00' = 15, '2019-10-01 00:00:00' = 39.6, and so on. The year_changes function shows the indexes where a new year starts. Since 4 months of 2019 are present (from 2019-09-01 00:00:00 up to, but not including, 2020-01-01 00:00:00), it takes the sum of 15., 39.6, 0.2 and 34.3 and divides it by 4, the number of months in 2019, which gives 22.28. Instead of only the mean, I am trying to build a table that shows the mean, median, max and min for each year. How would I code that?
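To make the 2019 arithmetic concrete, here is a minimal NumPy-only sketch using just the four 2019 values quoted above. The name values_2019 is made up for this illustration and is not part of the original code.

```python
import numpy as np

# The four 2019 values quoted above (September through December).
# values_2019 is a hypothetical name used only for this illustration.
values_2019 = np.array([15.0, 39.6, 0.2, 34.3])

# Sum divided by the number of months; equivalent to values_2019.mean().
mean_2019 = values_2019.sum() / values_2019.size

# The other statistics asked for, computed on the same four values.
median_2019 = np.median(values_2019)
max_2019 = values_2019.max()
min_2019 = values_2019.min()

print(round(mean_2019, 3), round(median_2019, 2), max_2019, min_2019)
```

The mean comes out at roughly 22.275, which rounds to the 22.28 mentioned above.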

import numpy as np
import pandas as pd
from pandas import DataFrame

date_list = ['2019-09-01 00:00:00', '2019-10-01 00:00:00', '2019-11-01 00:00:00',
 '2019-12-01 00:00:00', '2020-01-01 00:00:00', '2020-02-01 00:00:00', 
 '2020-03-01 00:00:00', '2020-04-01 00:00:00', '2020-05-01 00:00:00', 
 '2020-06-01 00:00:00', '2020-07-01 00:00:00', '2020-08-01 00:00:00',
 '2020-09-01 00:00:00','2020-10-01 00:00:00', '2020-11-01 00:00:00', 
 '2020-12-01 00:00:00','2021-01-01 00:00:00','2021-02-01 00:00:00', '2021-03-01 00:00:00', 
 '2021-04-01 00:00:00','2021-05-01 00:00:00', '2021-06-01 00:00:00', 
 '2021-07-01 00:00:00']
monthly_values = np.array([ 15., 39.6, 0.2, 34.3, 19.6, 26.8, 15.7, 26., 12.6, 15.5, 18.6, 2.3, 6.5,
   2.5, 12.2, 11.6, 93.9, 25.5, 26.5, -16.5, -1.4, -1.8, 5.])

data = pd.DataFrame({"Date": date_list, "Averages": monthly_values})
data["Date"] = pd.to_datetime(data["Date"])
print(data.groupby(data["Date"].dt.year).mean())

Output:

       Averages
Date           
2019  22.275000
2020  14.158333
2021  18.742857

Expected Output:

       Averages    Median    Max    Min
Date           
2019  22.275000    24.65     39.6   0.2
2020  14.158333    14.05     26.8   2.3
2021  18.742857    5.00      93.9  -16.5

Does this answer your question? [Multiple aggregations of the same column using pandas GroupBy.agg()](https://stackoverflow.com/questions/12589481/multiple-aggregations-of-the-same-column-using-pandas-groupby-agg) — Nk03, Jul 05 '21 at 06:45
Hi! Was your query solved? If so, consider [accepting](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235) an answer to signal to others that the issue is resolved. If not, you can provide feedback so that the answer can be improved (or removed) — Anurag Dabas, Aug 14 '21 at 07:50

Anurag Dabas · Answer 1 · 2021-07-05T06:53:12.703

Try groupby(), agg(), droplevel() and rename():

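# Select only the "Averages" column so that the datetime "Date" column is
# not aggregated as well (newer pandas versions no longer drop it silently).
out=(data.groupby(data["Date"].dt.year)[["Averages"]]
     .agg(['mean','median','max','min'])
     .droplevel(0,axis=1)   # drop the outer "Averages" column level
     .rename(columns=lambda x:'Average' if x=='mean' else x.title()))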

OR

via pivot_table(), droplevel() and rename():

out=(data.pivot_table(values='Averages',index=data["Date"].dt.year,
                      aggfunc=['mean','median','max','min'])
         .droplevel(1,axis=1)   # drop the inner "Averages" column level
         .rename(columns=lambda x:'Average' if x=='mean' else x.title()))

output of out:

         Average    Median  Max     Min
Date                
2019    22.275000   24.65   39.6    0.2
2020    14.158333   14.05   26.8    2.3
2021    18.742857   5.00    93.9    -16.5
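
As a possible alternative (a sketch, not part of the original answer), pandas named aggregation can set the column names directly, which avoids droplevel() and rename(). The block below rebuilds the question's data so it runs on its own; it assumes the dates are exactly the 23 month starts from 2019-09 to 2021-07, generated here with pd.date_range instead of the string list.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data: 23 month-start dates, 2019-09-01 to 2021-07-01.
# Using pd.date_range here is an assumption made to keep the sketch short.
dates = pd.date_range("2019-09-01", periods=23, freq="MS")
monthly_values = np.array([15., 39.6, 0.2, 34.3, 19.6, 26.8, 15.7, 26., 12.6,
                           15.5, 18.6, 2.3, 6.5, 2.5, 12.2, 11.6, 93.9, 25.5,
                           26.5, -16.5, -1.4, -1.8, 5.])
data = pd.DataFrame({"Date": dates, "Averages": monthly_values})

# Group the single numeric column by calendar year and name each statistic
# with keyword arguments (named aggregation on a SeriesGroupBy).
out = (
    data.groupby(data["Date"].dt.year)["Averages"]
        .agg(Average="mean", Median="median", Max="max", Min="min")
)
print(out)
```

Selecting the "Averages" column before aggregating also keeps the datetime column out of the result, so the output has exactly the four requested columns.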
