I want to transform my dataframe as shown below. I want to do it in a loop because I will not know in advance how many years the real data will contain; this is just a sample.
Data:
Year,Country,state,sex,dist,population
2019,IND,AP,F,EG,123
2019,IND,AP,F,WG,123
2019,IND,AP,F,VZA,96
2019,IND,AP,F,BZA,172
2019,IND,AP,M,EG,101
2019,IND,AP,M,WG,174
2019,IND,AP,M,VZA,86
2019,IND,AP,M,BZA,129
2019,IND,UP,F,A,107
2019,IND,UP,F,B,112
2019,IND,UP,M,C,165
2019,IND,UP,M,D,99
2020,IND,AP,F,EG,77
2020,IND,AP,F,WG,123
2020,IND,AP,F,VZA,116
2020,IND,AP,F,BZA,79
2020,IND,AP,M,EG,110
2020,IND,AP,M,WG,144
2020,IND,AP,M,VZA,93
2020,IND,AP,M,BZA,132
2020,IND,UP,F,A,150
2020,IND,UP,F,B,100
Code:
serial = df.Year.unique()
global a
a = []
for i in range(len(serial)):
    globals()['serial%s' % i] = df.loc[df.Year == serial[i]]
    a.append('serial%s' % i)
k = pd.concat(a, axis=1, join='inner')
I have tried this, but it does not work: the variable "a" is a list of strings, not dataframes, so concat throws an error.
I am also not sure how to rename the columns (Year to 2019, 2020) inside the loop, as pictured.
Can you help me figure this out?
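
For reference, here is a minimal sketch of one way the loop could be written so that concat receives dataframes instead of their names. It assumes the goal is one population column per year, with rows matched on Country, state, sex and dist; the names key_cols, frames, per_year and wide are only illustrative, and the data is a small subset of the sample above.

```python
import io

import pandas as pd

# A subset of the sample data above, kept short for illustration.
csv_text = """Year,Country,state,sex,dist,population
2019,IND,AP,F,EG,123
2019,IND,AP,F,WG,123
2019,IND,UP,F,A,107
2019,IND,UP,M,C,165
2020,IND,AP,F,EG,77
2020,IND,AP,F,WG,123
2020,IND,UP,F,A,150
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: these columns together identify a row within one year.
key_cols = ["Country", "state", "sex", "dist"]

# Collect the per-year objects themselves, not strings naming them.
frames = []
for year, group in df.groupby("Year"):
    # Index by the key columns and name the population column after the year.
    per_year = group.set_index(key_cols)["population"].rename(str(year))
    frames.append(per_year)

# join="inner" keeps only the rows that are present in every year.
wide = pd.concat(frames, axis=1, join="inner").reset_index()
print(wide)
```

Because the loop iterates over df.groupby("Year"), it handles however many years the data contains, and no globals() variables are needed. If rows missing from some years should be kept as NaN instead of dropped, join="outer" could be used in place of join="inner".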