
I have two data frames with the same headers, similar to this:

Jul    X1   X2  X3  X4    X5

The dimensions of the two data frames are:

A: nrow = 2191, ncol = 51

B: nrow = 366, ncol = 51

Both data frames have exactly the same columns. The first contains daily temperature data for 4 years, while the second is a "reference". I want to compute (A - B) on the rows where the first column (Jul) of the two data frames matches. Could you please suggest a way to do that while avoiding loops? Cheers

  • You should provide a minimal, reproducible example with your question. Have a look at https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and add the relevant information. – MKR Mar 28 '18 at 21:15
  • Have a look at `dplyr::anti_join()`. You can probably use `dplyr::anti_join(A, B, by="Jul")` – MKR Mar 28 '18 at 21:18

1 Answer


If you know SQL, the sqldf package lets you run SQL queries directly on data frames:

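# Example data: D1 has keys 1..5, D2 has keys 1..3
D1 <- data.frame(a = 1:5, b = letters[1:5])
D2 <- data.frame(a = 1:3, b = letters[1:3])

library(sqldf)

# Rows of D1 whose key `a` does not appear in D2
a1NotIna2 <- sqldf('SELECT * FROM D1 WHERE (a NOT IN (SELECT a FROM D2))')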
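
The query above returns the rows of D1 whose key is absent from D2. If the goal is instead the arithmetic difference A - B on rows where Jul matches, as the question describes, one loop-free option is base R's `match()`. The following is only a minimal sketch on made-up toy data; the values in `A` and `B` and the names `idx`, `value_cols` and `diff_AB` are assumptions for illustration, not taken from the question.

```r
# Toy data (hypothetical): A holds repeated daily values,
# B holds one reference row per value of Jul.
A <- data.frame(Jul = c(1, 2, 3, 1, 2, 3),
                X1  = c(10, 11, 12, 13, 14, 15),
                X2  = c(20, 21, 22, 23, 24, 25))
B <- data.frame(Jul = c(1, 2, 3),
                X1  = c(1, 2, 3),
                X2  = c(5, 5, 5))

# For every row of A, the index of the row of B with the same Jul
# (NA where B has no such row).
idx <- match(A$Jul, B$Jul)

# Columns to subtract: everything except the key column.
value_cols <- setdiff(names(A), "Jul")

# Vectorised A - B: B's rows are repeated and reordered to line up with A.
diff_AB <- A
diff_AB[value_cols] <- A[value_cols] - B[idx, value_cols, drop = FALSE]
```

Rows of A whose Jul has no counterpart in B end up as NA in the value columns; `diff_AB[!is.na(idx), ]` would keep only the matched rows.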
Frostic