2

I have a dataframe data_df with n rows:

Rank       DutyCode
200        ABC
300        DEF
400        GHI

Then, I want to iteratively join them as one row, example:

Rank       DutyCode  Rank_1      DutyCode_1    Rank_2       DutyCode_2 
200        ABC        300        DEF           400        GHI

My current code :

column_data_df = data_df.iloc[[1]]
data_df.iloc = data_df.iloc.drop([0])

data_df = data_df.join(column_data_df, rsuffix='_1')

and only produce :

Rank       DutyCode  Rank_1      DutyCode_1    
200        ABC        300        DEF        

This only works for one time.. I want it to be dynamic for n number of rows.

EDIT

Another dataframe :

       Rank             DutyCode
{'a':'b', 'i':'j'}        ABC
{'a':'b', 'i':'j'}        DEF
{'a':'b', 'i':'j'}        GHI

To :

Rank.a  Rank.i  DutyCode  Rank_1.a  Rank_1.i   DutyCode_1
   b       j       ABC       b        j           DEF

I hope this makes sense..

cna
  • 317
  • 2
  • 9

2 Answers2

1

Use DataFrame.stack for Series with MultiIndex, convert to one row DataFrame by Series.to_frame and transpose and last flatten MultiIndex:

df = df.stack().to_frame().T
df.columns = df.columns.map(lambda x: f'{x[1]}_{x[0]}').str.replace('_0','')
print (df)
  Rank DutyCode Rank_1 DutyCode_1 Rank_2 DutyCode_2
0  200      ABC    300        DEF    400        GHI

Or for flatten use list comprehension:

df = df.stack().to_frame().T
df.columns = [f'{b}' if a == 0 else f'{b}_{a}' for a, b in df.columns]

print (df)
  Rank DutyCode Rank_1 DutyCode_1 Rank_2 DutyCode_2
0  200      ABC    300        DEF    400        GHI

EDIT:

import ast

#if necessary
#df['Rank'] = df['Rank'].apply(ast.literal_eval)

df = pd.json_normalize(df.pop('Rank')).add_prefix('Rank.').join(df)
print (df)
  Rank.a Rank.i DutyCode
0      b      j      ABC
1      b      j      DEF
2      b      j      GHI

df = df.stack().to_frame().T
df.columns = [f'{b}' if a == 0 else f'{b}_{a}' for a, b in df.columns]

print (df)
  Rank.a Rank.i DutyCode Rank.a_1 Rank.i_1 DutyCode_1 Rank.a_2 Rank.i_2  \
0      b      j      ABC        b        j        DEF        b        j   

  DutyCode_2  
0        GHI  
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
1

This is not the cleanest way, but it works:

import pandas as pd
input = pd.DataFrame(data=[[200, 'ABC'], [300, 'DEF'],[400, 'GHI']],\
             columns=['Rank' ,'DutyCode'])

df = input.iloc[0]
for row in range(1,len(df)+1):
    df['Rank_' +str(row)] = input.loc[row]['Rank']
    df['DutyCode_' +str(row)] = input.loc[row]['DutyCode']
Oliver Prislan
  • 320
  • 4
  • 12
  • This is more intuitive. Thanks! – cna Oct 08 '20 at 07:33
  • 1
    I think loop in pandas is antipattern, so you can avoid it. Check [this](https://stackoverflow.com/a/55557758). – jezrael Oct 08 '20 at 07:42
  • @jezrael : thank you for the tip. I knew it's not pythonic (pandastic), for me it's just easier to understand - nevertheless good to know - if performance is needed. – Oliver Prislan Oct 08 '20 at 14:15