
I am using dplyr 1.0.6 and R 4.1.0, and I call two functions as follows:

AllCustomersList <- loadAllCustomersData()

CouldJoinByNationalID <- matchCustomersByNationalCode(AllCustomersList = AllCustomersList)

loadAllCustomersData() returns a list of two data frames, and matchCustomersByNationalCode() then tries to run a semi_join on those two data frames as follows:

matchCustomersByNationalCode <- function(AllCustomersList) {

  FDCustomers <- AllCustomersList$FDCustomers
  Customers <- AllCustomersList$Customers

  # Keep the FDCustomers rows whose NationalID matches a NationalCode in Customers
  semi_join(x = FDCustomers, y = Customers, by = c("NationalID" = "NationalCode"), na_matches = "never") %>%
    pull(NationalID)
}

Actually, this is just a wrapper around semi_join for the sake of naming. But it throws an error that says:

Error: x and y must share the same src, set copy = TRUE (may be slow).

Run rlang::last_error() to see where the error occurred.

Called from: signal_abort(cnd)

Could anyone help with this?

Ali Sadeghi Aghili

2 Answers


Thanks to walter and Martin Gal, I tried to make a reproducible example, and it worked! So I checked the class of both data frames, and both are reported as data.frame. But when I converted them to data.frame again inside the match function, the join worked! It is still odd to me, but the problem is solved.
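A minimal sketch of that workaround, for anyone who wants to rerun it. The real output of loadAllCustomersData() is not shown in the question, so the list below is a hypothetical stand-in with invented values; only the column names and the join call come from the original function.

```r
library(dplyr)

# Hypothetical stand-in for loadAllCustomersData(); the values are invented.
AllCustomersList <- list(
  FDCustomers = data.frame(NationalID = c("001", "002", NA, "004")),
  Customers   = data.frame(NationalCode = c("002", "004", NA, "005"))
)

matchCustomersByNationalCode <- function(AllCustomersList) {
  # Coerce both inputs to plain data frames before joining
  FDCustomers <- as.data.frame(AllCustomersList$FDCustomers)
  Customers   <- as.data.frame(AllCustomersList$Customers)

  semi_join(
    x = FDCustomers, y = Customers,
    by = c("NationalID" = "NationalCode"),
    na_matches = "never"   # NA keys are not treated as matching each other
  ) %>%
    pull(NationalID)
}

CouldJoinByNationalID <- matchCustomersByNationalCode(AllCustomersList)
```

With this mock data the NA rows are dropped because of na_matches = "never". If the original error persists after the coercion, running class() on both list elements just before the join may help narrow down which input is not a plain data frame.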

Ali Sadeghi Aghili
    It would be really great if you took the time to write the whole code down for other people to be able to rerun this completely on the fly for themselves without having to guess what exactly you are getting at. – Patrick Oct 13 '22 at 08:58

To resolve the error message above, you can follow the approach described in the documentation (https://dplyr.tidyverse.org/reference/mutate-joins.html): when the two inputs to the join do not come from the same data source, add the copy argument and set it to TRUE. The mock example below assumes two data frames (d_a, d_b) that share two columns used as join keys. Note that copy = TRUE is included:

d_a %>%
  left_join(d_b,
            by = c("T1_ID_LOC", "TIME"),
            copy = TRUE)
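To make the mock example runnable on its own, here is a small sketch with invented data. The column names T1_ID_LOC and TIME come from the snippet above; the value columns and all values are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical mock data; only the key column names come from the answer.
d_a <- data.frame(
  T1_ID_LOC = c(1, 2, 3),
  TIME      = c(10, 20, 30),
  value_a   = c("a", "b", "c")
)
d_b <- data.frame(
  T1_ID_LOC = c(1, 2, 4),
  TIME      = c(10, 20, 40),
  value_b   = c("x", "y", "z")
)

# Join on both key columns; copy = TRUE allows y to be copied
# into the same source as x when the two sources differ.
joined <- d_a %>%
  left_join(d_b, by = c("T1_ID_LOC", "TIME"), copy = TRUE)
```

Both inputs here are already local data frames, so copy = TRUE changes nothing in this sketch; it only matters when y lives in a different source than x, and the copy may then be slow, as the error message warns.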
GG-Delta