I am trying to implement the example given here: https://cran.r-project.org/web/packages/multidplyr/vignettes/multidplyr.html

However, I get the following error when I reach the point where I need to partition the data, using either method 1 or method 2. I have tried reinstalling the Rcpp package, but it still doesn't work.

Error in qs::qsave(values, path, preset = "fast", check_hash = FALSE, : function 'Rcpp_precious_remove' not provided by package 'Rcpp'

Below is the code sample:

library(multidplyr)

library(dplyr, warn.conflicts = FALSE)

library(nycflights13)

###Creating a cluster
cluster <- new_cluster(2)

#### Method 1: partition the data directly. Not working; investigate why. Use Method 2 instead.
# flights1 <- flights %>% group_by(dest) %>% partition(cluster)

# Method 2: to show how that might work, first split flights up by month and save as CSV files:
path <- tempfile()
dir.create(path)

flights %>% 
  group_by(month) %>% 
  group_walk(~ vroom::vroom_write(.x, sprintf("%s/month-%02i.csv", path, .y$month)))

# Now we find all the files in the directory, and divide them up so that each worker gets (approximately) the same number of pieces:

files <- dir(path, full.names = TRUE)
cluster_assign_partition(cluster, files = files)


# Then we read in the files on each worker and use party_df() to create a partitioned dataframe:

cluster_send(cluster, flights2 <- vroom::vroom(files))

flights2 <- party_df(cluster, "flights2")


###dplyr verbs. 

# flights1 is never created because Method 1 is commented out, so use flights2 here
df <- flights2 %>%
  summarise(dep_delay = mean(dep_delay, na.rm = TRUE)) %>%
  collect()
Kishron
  • Update your version of Rcpp with `install.packages("Rcpp")` – MrFlick Aug 04 '21 at 21:57
  • Lightspeed, @MrFlick, lightspeed, ... :) – Dirk Eddelbuettel Aug 04 '21 at 21:58
  • Thanks @MrFlick but I have tried doing that already and I still get the same error. – Kishron Aug 04 '21 at 22:15
  • What version of Rcpp are you currently using? In fact, please show the version numbers for all packages involved using `sessionInfo()` – MrFlick Aug 04 '21 at 22:19
  • I am using R version 4.1.0. The `sessionInfo()` printout is below. R version 4.1.0 (2021-05-18); other attached packages: [1] nycflights13_1.0.2 dplyr_1.0.7 multidplyr_0.1.0 Rcpp_1.0.6 – Kishron Aug 04 '21 at 22:33
  • You are not using the most recent version of Rcpp. The most recent version is `1.0.7` (released 2021-07-07). If you did run `install.packages("Rcpp")`, then you most likely got an error that prevented the actual update. Make sure to check the output of that command. – MrFlick Aug 04 '21 at 22:38
  • Thank you @MrFlick. All resolved now. I was having problems updating to version 1.0.7, and I found this link helpful as well: https://stackoverflow.com/questions/45922418/error-in-install-packages-cannot-remove-prior-installation-of-package-dbi?noredirect=1&lq=1 – Kishron Aug 05 '21 at 15:14
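
The version check discussed in the comments above can be sketched in base R. This is only an illustration: `is_outdated` is a hypothetical helper name, not part of Rcpp or multidplyr, and the version strings are the ones quoted in the thread. The advice to restart R before reinstalling is an assumption based on the linked "cannot remove prior installation" question, not something confirmed for every setup.

```r
# Hypothetical helper: TRUE when the installed version is older than the required one.
is_outdated <- function(installed, required) {
  package_version(installed) < package_version(required)
}

# Version strings taken from the comments above (installed 1.0.6, required 1.0.7).
is_outdated("1.0.6", "1.0.7")

# To check the live session instead (assumes Rcpp is installed):
# is_outdated(as.character(packageVersion("Rcpp")), "1.0.7")

# If it is outdated, restart R so no loaded package holds the old Rcpp files
# (an assumption, see above), then reinstall and read the output for errors:
# install.packages("Rcpp")
```

Comparing with `package_version()` rather than plain strings avoids mistakes such as treating "1.0.10" as older than "1.0.7".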
