0

So I have been trying to find a solution to merge 2 data frames. the primary keys are matching sometimes and sometimes they are on DF1 but not DF2 and DF2 but not DF1. I want all rows and ID's to be displayed and to give a '-' for values that do not appear. For values that do appear in both I want them to be concatenated in the same row as shown in DF3:

    >>>DF1                             >>>DF2          
    ID  Apples  Oranges  Transaction  ID  Milk  Meat Transaction
    1   2       1        Credit       1   2     3    Cash
    3   3       4        Debit        2   7     2    Cash
    5   2       3        Cash         6   2     3    Cash

    >>>DF3
    ID  Apples  Oranges Transaction1  Milk  Meat Transaction2
    1   2       1       Credit        2     3    Cash
    2   -       -       -             7     2    Cash
    3   3       4       Debit         -     -    -
    5   2       3       Cash          -     -    -
    6   -       -       -             2     3    Cash
eweiss1997
  • 17
  • 4
  • 1
    I think df1.merge(df2, how='outer') does what you want. You will have NaN but you can then set them to '-' using fillna. – HGLR Oct 11 '19 at 15:27

1 Answers1

0

Looks like a basic outer join, and a fillna() to replace the NaN values with '-'

DF1.merge(df2, on='ID', how='outer').fillna('-').sort_index()

ID  Apples  Oranges Transaction_x   Milk    Meat    Transaction_y
1   2       1       Credit          2       3       Cash
2   -       -       -               7       2       Cash
3   3       4       Debit           -       -       -
5   2       3       Cash            -       -       -
6   -       -       -               2       3       Cash
G. Anderson
  • 5,815
  • 2
  • 14
  • 21