So I have a list of generator generation in excel:
Gen1 Gen2 Gen3 Gen4
12 88 32 13
22 21 71 11
22 354 32 74
43 21 65 11
33 21 44 11
55 22 32 43
44 21 23 11
12 21 32 77
76 33 67 11
123 55 32 67
32 21 32 14
49 76 78 11
54 21 32 11
33 21 32 44
22 44 998 43
77 21 32 32
09 34 55 11
98 21 32 14
33 21 44 16
21 23 22 341
There are 20 values
in each column and I wan to reduce to 5 values
per column, which will be done by taking the average of 4 values
at a time. So it will be like taking the average of 1-4 elements first then 5-8, 9-12, 13-16 and last 17-20 elements.
The expected output will look like this:
Gen1 Gen2 Gen3 Gen4
24.75 121 50 27.25
36 21.25 32.75 35.5
70 46.25 52.25 25.75
46.5 26.75 273.5 32.5
40.25 24.75 38.25 95.5
I know how to do this in python and my approach is this which gives the exact output that I want:
import numpy as np
gen1 = np.array(df['Gen1'])
gen2 = np.array(df['Gen2'])
gen3 = np.array(df['Gen3'])
gen4 = np.array(df['Gen4'])
solutionArr = np.zeros((5,4))
gen1List = []
gen2List = []
gen3List = []
gen4List = []
for i in range(5):
gen1List.append(np.mean(gen1[i*4:(i+1)*4]))
gen2List.append(np.mean(gen2[i*4:(i+1)*4]))
gen3List.append(np.mean(gen3[i*4:(i+1)*4]))
gen4List.append(np.mean(gen4[i*4:(i+1)*4]))
solutionArr[:,0] = gen1List
solutionArr[:,1] = gen2List
solutionArr[:,2] = gen3List
solutionArr[:,3] = gen4List
print(solutionArr)
But I want to use pandas for this because in real dataset I have more 140 Generators with very different names(not like gen1, gen2 etc). So can anyone please help with this?