join table horizontally by loop based on rows in a dataframe

Question

I have a dataframe data_df with n rows:

Rank       DutyCode
200        ABC
300        DEF
400        GHI

Then, I want to iteratively join them as one row, example:

Rank       DutyCode  Rank_1      DutyCode_1    Rank_2       DutyCode_2 
200        ABC        300        DEF           400        GHI

My current code :

column_data_df = data_df.iloc[[1]]
data_df.iloc = data_df.iloc.drop([0])

data_df = data_df.join(column_data_df, rsuffix='_1')

and only produce :

Rank       DutyCode  Rank_1      DutyCode_1    
200        ABC        300        DEF

This only works for one time.. I want it to be dynamic for n number of rows.

EDIT

Another dataframe :

       Rank             DutyCode
{'a':'b', 'i':'j'}        ABC
{'a':'b', 'i':'j'}        DEF
{'a':'b', 'i':'j'}        GHI

To :

Rank.a  Rank.i  DutyCode  Rank_1.a  Rank_1.i   DutyCode_1
   b       j       ABC       b        j           DEF

I hope this makes sense..

jezrael · Accepted Answer · 2020-10-08T09:20:37.593

1

Use DataFrame.stack for Series with MultiIndex, convert to one row DataFrame by Series.to_frame and transpose and last flatten MultiIndex:

df = df.stack().to_frame().T
df.columns = df.columns.map(lambda x: f'{x[1]}_{x[0]}').str.replace('_0','')
print (df)
  Rank DutyCode Rank_1 DutyCode_1 Rank_2 DutyCode_2
0  200      ABC    300        DEF    400        GHI

Or for flatten use list comprehension:

df = df.stack().to_frame().T
df.columns = [f'{b}' if a == 0 else f'{b}_{a}' for a, b in df.columns]

print (df)
  Rank DutyCode Rank_1 DutyCode_1 Rank_2 DutyCode_2
0  200      ABC    300        DEF    400        GHI

EDIT:

import ast

#if necessary
#df['Rank'] = df['Rank'].apply(ast.literal_eval)

df = pd.json_normalize(df.pop('Rank')).add_prefix('Rank.').join(df)
print (df)
  Rank.a Rank.i DutyCode
0      b      j      ABC
1      b      j      DEF
2      b      j      GHI

df = df.stack().to_frame().T
df.columns = [f'{b}' if a == 0 else f'{b}_{a}' for a, b in df.columns]

print (df)
  Rank.a Rank.i DutyCode Rank.a_1 Rank.i_1 DutyCode_1 Rank.a_2 Rank.i_2  \
0      b      j      ABC        b        j        DEF        b        j   

  DutyCode_2  
0        GHI

edited Oct 08 '20 at 09:20

answered Oct 08 '20 at 07:10

jezrael

822,522
95
1,334
1,252

1

Any other way to avoid lambda? – cna Oct 08 '20 at 07:11
@cna - Sure, added another solution – jezrael Oct 08 '20 at 07:14
1

hey wondering how this code can be improvised to also unpack column with dictionary. I can provide example.. – cna Oct 08 '20 at 09:05
@cna - OK, super, waiting for example – jezrael Oct 08 '20 at 09:06
Great!, will do now – cna Oct 08 '20 at 09:08
Getting `AttributeError: 'float' object has no attribute 'items'` at `json.normalize` – cna Oct 08 '20 at 09:32
Ah I think its because there are Nan rows – cna Oct 08 '20 at 09:37
1

Ok got it, `dropna()` solved the issue. Thank you very very very much :) – cna Oct 08 '20 at 09:41

score 1 · Answer 2 · answered Oct 08 '20 at 07:32

1

This is not the cleanest way, but it works:

import pandas as pd
input = pd.DataFrame(data=[[200, 'ABC'], [300, 'DEF'],[400, 'GHI']],\
             columns=['Rank' ,'DutyCode'])

df = input.iloc[0]
for row in range(1,len(df)+1):
    df['Rank_' +str(row)] = input.loc[row]['Rank']
    df['DutyCode_' +str(row)] = input.loc[row]['DutyCode']

answered Oct 08 '20 at 07:32

Oliver Prislan

320
4
12

This is more intuitive. Thanks! – cna Oct 08 '20 at 07:33
1

I think loop in pandas is antipattern, so you can avoid it. Check [this](https://stackoverflow.com/a/55557758). – jezrael Oct 08 '20 at 07:42
@jezrael : thank you for the tip. I knew it's not pythonic (pandastic), for me it's just easier to understand - nevertheless good to know - if performance is needed. – Oliver Prislan Oct 08 '20 at 14:15

join table horizontally by loop based on rows in a dataframe

2 Answers2