I want to transform my dataframe as shown below. I want to do it with a loop because I will not know how many years of data the real-time data will contain; this is just a sample.
[image: desired output]

Data:

Year,Country,state,sex,dist,population
2019,IND,AP,F,EG,123
2019,IND,AP,F,WG,123
2019,IND,AP,F,VZA,96
2019,IND,AP,F,BZA,172
2019,IND,AP,M,EG,101
2019,IND,AP,M,WG,174
2019,IND,AP,M,VZA,86
2019,IND,AP,M,BZA,129
2019,IND,UP,F,A,107
2019,IND,UP,F,B,112
2019,IND,UP,M,C,165
2019,IND,UP,M,D,99
2020,IND,AP,F,EG,77
2020,IND,AP,F,WG,123
2020,IND,AP,F,VZA,116
2020,IND,AP,F,BZA,79
2020,IND,AP,M,EG,110
2020,IND,AP,M,WG,144
2020,IND,AP,M,VZA,93
2020,IND,AP,M,BZA,132
2020,IND,UP,F,A,150
2020,IND,UP,F,B,100

Code:

serial=df.Year.unique()
global a
a=[]

for i in range(len(serial)):
  
    globals()['serial%s' % i]=df.loc[df.SERIAL==serial[i]]
    a.append('serial%s' % i)
    

k=pd.concat(a,axis=1,join='inner')

I tried this, but it does not work: the variable "a" is a list of strings, so concat throws an error. I am also not sure how to rename the columns in the loop (from Year to 2019, 2020) as pictured. Can you help me figure this out?
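
For reference, here is a minimal sketch of how the loop idea could be made to work, assuming the goal is one population column per year. It is not the asker's code: it filters on `Year` (the sample data has no `SERIAL` column), collects the per-year pieces themselves in a list instead of their variable names, and names each piece after its year. The small `df` and the names `keys`, `frames`, and `part` are illustrative assumptions.

```python
import pandas as pd

# Small subset of the sample data from the question (illustrative only).
df = pd.DataFrame({
    "Year": [2019, 2019, 2020, 2020],
    "Country": ["IND", "IND", "IND", "IND"],
    "state": ["AP", "AP", "AP", "AP"],
    "sex": ["F", "F", "F", "F"],
    "dist": ["EG", "WG", "EG", "WG"],
    "population": [123, 123, 77, 123],
})

keys = ["Country", "state", "sex", "dist"]
frames = []  # holds the per-year Series themselves, not variable-name strings

for year in df["Year"].unique():
    part = (
        df.loc[df["Year"] == year]   # rows for this year only
        .set_index(keys)["population"]
        .rename(int(year))           # the Series name becomes the column label
    )
    frames.append(part)

# Align the per-year Series on the shared key index, one column per year.
k = pd.concat(frames, axis=1, join="inner").reset_index()
print(k)
```

Note that `join='inner'` keeps only the key combinations present in every year, so rows such as UP/M/C, which appear only in 2019 in the sample, would be dropped; `join='outer'` would keep them with missing values.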

I'mahdi
jagan k
    This operation is known as a pivot: `k = df.pivot_table(index=['Country', 'state', 'sex', 'dist'], columns='Year', values='population').rename_axis(columns=None).reset_index()` (A loop is not necessary; this will work regardless of the number of years.) – Henry Ecker Aug 23 '21 at 02:49
    Thank you so much Henry Ecker for the simple and easy solution – jagan k Aug 23 '21 at 03:52
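
To illustrate the pivot suggested in the comment above, here is a small runnable sketch on a subset of the sample data. The `pivot_table` call is the one from the comment; the trimmed `df` is an assumption made to keep the example short. Keep in mind that `pivot_table` aggregates with the mean by default, so the year columns come back as floats and duplicate key combinations would be averaged.

```python
import pandas as pd

# Subset of the sample data; UP/M/C exists only for 2019 (illustrative only).
df = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2020, 2020],
    "Country": ["IND", "IND", "IND", "IND", "IND"],
    "state": ["AP", "AP", "UP", "AP", "AP"],
    "sex": ["F", "F", "M", "F", "F"],
    "dist": ["EG", "WG", "C", "EG", "WG"],
    "population": [123, 123, 165, 77, 123],
})

k = (
    df.pivot_table(
        index=["Country", "state", "sex", "dist"],
        columns="Year",
        values="population",
    )
    .rename_axis(columns=None)  # drop the "Year" label above the new columns
    .reset_index()              # turn the key index back into ordinary columns
)
print(k)
```

Unlike an inner join of per-year pieces, this keeps key combinations that are missing in some years and fills the gap with NaN.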

0 Answers