
I have the following situation: two huge data frames, X and Y (about 13 million rows and 11 columns each), that I need to merge in a specific way. An example of the X data frame:

    A   1   2   3
    B   3   2   4
    C   1   6   8

The Y dataframe is

    A   9   1   8
    B   3   1   7
    D   2   9   4

I have to combine them with the following logic: (1) if the first element of a row in Y is present in X, I have to append the Y data to that X row; (2) if the first element of a row in Y is not present in X, I have to add a new row with zeroes in the X columns, followed by the Y data; (3) for all X rows not present in Y, I have to append zeroes. The combined result should look like this:

    A   1   2   3   9   1   8       A was found in Y, so its Y data is appended
    B   3   2   4   3   1   7       B was found in Y, so its Y data is appended
    C   1   6   8   0   0   0       C was not found in Y, so zeroes are appended
    D   0   0   0   2   9   4       D was not found in X, so zeroes are added, then D's Y data

I tried going row by row, but it takes ages. I need a solution that does this in one or two instructions.

Thanks

joran
Alex Fort

1 Answer


Without a reproducible example I can't test this, but I think you want:

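    library(dplyr)
    # "FirstColumn" is a placeholder for the name of the key column shared by X and Y
    z <- full_join(X, Y, by = "FirstColumn")
    # unmatched rows get NA in the other table's columns; replace those with 0
    z[is.na(z)] <- 0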

This assumes there are no NAs in the original data, since any pre-existing NA would also be replaced with 0.
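As a rough illustration, here is the same approach applied to the sample data from the question. The column names (`id`, `x1`..`x3`, `y1`..`y3`) are made up for this sketch, since the question does not give any. The real data would need its own key column name in `by`.

```r
library(dplyr)

# Sample data from the question. The column names are hypothetical.
X <- data.frame(id = c("A", "B", "C"),
                x1 = c(1, 3, 1), x2 = c(2, 2, 6), x3 = c(3, 4, 8),
                stringsAsFactors = FALSE)
Y <- data.frame(id = c("A", "B", "D"),
                y1 = c(9, 3, 2), y2 = c(1, 1, 9), y3 = c(8, 7, 4),
                stringsAsFactors = FALSE)

# Full outer join: keeps every id that occurs in X or in Y.
z <- full_join(X, Y, by = "id")

# Rows without a match have NA in the other table's columns; set them to 0.
z[is.na(z)] <- 0

print(z)
```

On this sample the four rows should match the expected result in the question (A, B, C, D). The join is a single vectorised call, so it avoids the row-by-row loop. How it behaves on 13 million rows depends on available memory and has not been measured here.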

joran
John Paul