I have a time series data frame for eight Years (2013-2020) has Hourly data, each Year has Nine zones, under each zone two columns("Gen", "Load") as follows:
A ZONE B ZONE ... G ZONE H ZONE I ZONE
date_time GEN LOAD GEN LOAD ... LOAD GEN LOAD GEN LOAD
2013-01-01 00:00:00 725.7 5,859.5 312.2 3,194.7 ... 77.1 706.0 227.1 495.0 861.9
2013-01-01 01:00:00 436.2 450.5 248.0 198.0 ... 865.5 240.7 107.9 640.5 767.3
2013-01-01 02:00:00 464.5 160.2 144.2 068.3 ... 738.7 044.7 32.7 509.3 700.4
2013-01-01 03:00:00 169.9 733.8 268.1 869.5 ... 671.7 649.4 951.3 626.8 652.1
2013-01-01 04:00:00 145.4 553.4 280.2 872.8 ... 761.5 561.0 912.9 552.1 637.3
... ... ... ... ... ... ... ... ... ... ... ...
2020-12-31 19:00:00 450.9 951.7 371.4 516.3 ... 461.7 808.9 471.4 983.7 447.8
2020-12-31 20:00:00 553.0 936.5 848.7 233.9 ... 397.3 978.3 404.3 490.9 233.0
2020-12-31 21:00:00 458.6 735.6 716.8 121.7 ... 385.1 808.0 192.0 131.5 70.1
2020-12-31 22:00:00 515.8 651.6 693.5 142.4 ... 291.4 826.1 16.8 591.9 863.2
2020-12-31 23:00:00 218.6 293.4 448.2 14.2 ... 340.6 435.0 897.4 622.5 768.3
What I want is the following:
1- Detect outliers in each column which is more or less three time Standard Deviation of that column and put it in a new column its name "A_gen_outliers" if the there is outliers in "GEN"column under "A Zone" as well as "A_load_outliers" if the there is outliers in "LOAD"column under "A Zone". Number of new columns are 18 columns.
2- A new column represents sum of "Gen" columns
3- A new column represents sum of "Load" columns
4- A new column represents "GEN" column calculate A_GEN_div = cell value/maximum value of "GEN column under A Zone for each year for example 725.7/725.7=1 for the first cell and 436.2/725.1 for second cell and for last cell 218.6/553. etc. and the same for all "GEN" columns and also for "LOAD" columns- proposed names "A_Load_div".
Number of new columns are 18 columns.
Number of total new columns are "18 *2 + 2" columns
Thanks in advance.