0

I have two dataframes which are uneven in the number of rows. Now I want to add them horizontally by aligning the second dataframe based on a key ("flag"). However, the flag serves merely as a connector at a specific row to the first (base) df which means the second dataframe should be pasted at that connector point. Please see visual for what I mean in case it is not clear.

I tried looking into merge, concat, join etc but it will join it does not seem quiet like what I am looking for.

enter image description here

titu84hh
  • 133
  • 14
  • @user2552108 you basically need to connect the two dataframe on where the rows are yellow which will result on the output Final – titu84hh Oct 07 '21 at 07:33
  • What if you have multiple 1's in each dataframe how will second dataframe be placed in that case? – Muhammad Hassan Oct 07 '21 at 07:52
  • @Muhammadhassan there will NEVER be multiple 1's! Only one-time this can occur. – titu84hh Oct 07 '21 at 07:54
  • @titu84hh Please accept my answer below if it answers your question. Credit to https://stackoverflow.com/questions/43408621/add-a-row-at-top-in-pandas-dataframe – EBDS Oct 07 '21 at 08:08

2 Answers2

1
dif = df1['flag'].idxmax() - df2['flag'].idxmax()
df2.index = df2.index + dif
df1.merge(df2,how='outer',left_index=True,right_index=True)

enter image description here

Can make use of the above idea. Need to clean up the column names and drop the extra column. Works if dif is -ve.

Drop 1st 3 rows of df1

enter image description here

EBDS
  • 1,244
  • 5
  • 16
0

You can try:

df_base = pd.DataFrame(data={'flag':[0,0,0,0,1,0,0],
                             'transaction_value':[1,1,2,2,5,6,9]})
df_group2 = pd.DataFrame(data={'flag':[0,0,1,0,0],
                               'transaction_value':[1,1,2,2,5]})

diff = df_base['flag'].argmax() - df_group2['flag'].argmax()
df_group2.index = df_group2.index + diff
print(df_base.join(df_group2[['transaction_value']], rsuffix='_group2'))

Output:

   flag  transaction_value  transaction_value_group2
0     0                  1                       NaN
1     0                  1                       NaN
2     0                  2                       1.0
3     0                  2                       1.0
4     1                  5                       2.0
5     0                  6                       2.0
6     0                  9                       5.0
Muhammad Hassan
  • 4,079
  • 1
  • 13
  • 27