1

I have a dataframe:

id   year      value
a1   2020-K       A
a1   2020-1       B
a1   2020-2       A
a1   2021-K       A
a1   2021-7       C
a1   2022-K       C
a1   2022-1       C
a1   2022-2       A
b1   2020-K       D
b1   2020-1       C
b1   2020-2       A
b1   2021-K       D
b1   2021-1       D
b1   2021-2       C
b1   2022-K       C
b1   2022-1       B
b1   2022-2       A

I want to turn each unique value in "year" column into columns and put "values" in them. desired result is:

id    2020-K   2020-1  2020-2  2020-7  2021-K   2021-1  2021-2  2021-7  2022-K   2022-1  2022-2  2022-7
a1    A         B        A       NaN     A      NaN     NaN      C       C        C         A     NaN

b1    D         C        A       NaN     D      D        C       NaN     C        B         A     NaN

How could I do that? Is there any function for such transposing?

gh1222
  • 657
  • 2
  • 8
  • 1
    Pandas transpose : https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.transpose.html – Lanre Aug 08 '22 at 12:30
  • 1
    Does this answer your question? [How can I pivot a dataframe?](https://stackoverflow.com/questions/47152691/how-can-i-pivot-a-dataframe) – Chris Aug 08 '22 at 12:30

1 Answers1

0
output = df.pivot(index='id', columns='year', values='value')
output
###
year 2020-1 2020-2 2020-K 2021-1 2021-2 2021-7 2021-K 2022-1 2022-2 2022-K
id                                                                        
a1        B      A      A    NaN    NaN      C      A      C      A      C
b1        C      A      D      D      C    NaN      D      B      A      C

If you need 20XX-K 20XX-0 20XX-1 20XX-2 20XX-7 orders

df2 = pd.DataFrame(np.array(np.meshgrid(['a1','b1'],['2020','2021','2022'],['0','1','2','7'])).T.reshape(-1,3))
df2.sort_values(by=[1,0,2],inplace=True)
df2[3] = df2[1].str.cat(df2[2],sep='-')
df2.drop(columns=[1,2],inplace=True)
df2.reset_index(drop=True,inplace=True)
df2.rename(columns={0:'id',3:'year'},inplace=True)
df2['year'] = df2['year'].str.replace(r'0$','K', regex=True)
df3 = df2.merge(df,on=['id','year'], how='left').pivot(index='id', columns='year', values='value')
df3 = df3[df2['year'].unique()]
df3
###
year 2020-K 2020-1 2020-2 2020-7 2021-K 2021-1 2021-2 2021-7 2022-K 2022-1  2022-2 2022-7
id                                                                           
a1        A      B      A    NaN      A    NaN    NaN      C      C      C       A    NaN  
b1        D      C      A    NaN      D      D      C    NaN      C      B       A    NaN

Baron Legendre
  • 2,053
  • 3
  • 5
  • 22