Currently I am using the below to read in ~7-10 files to the R console all at once.

library(magrittr)
library(feather)

list.files("C:/path/to/files",pattern="\\.feather$") %>% lapply(read_feather)

How can I pipe these into independent data objects based on their unique file names?

ex.

accounts_jan.feather
users_jan.feather
-> read feather function -> hold in working memory as:
accounts_jan_df
users_jan_df

Thanks.

sgdata
  • Check `?list2env()` – Vlo Jan 31 '17 at 20:29
  • Have a read of gregor's answer to [this post](http://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames). It is generally a good idea to keep similar objects in a list. – lmo Jan 31 '17 at 20:30
  • @lmo These might not be similar objects (users vs accounts) – Frank Jan 31 '17 at 20:36
  • The objects are all of the same file type (`feather`) but do not have the same dimensions at all. Some are 3 variables with 150 rows, others are 6 variables with 700k rows. I could do `*file name*.feather <- read_feather("C:/path/to/dir")` for each, but I'd prefer to add a single call which will gather all `feather` files in a directory and read them into my env. – sgdata Jan 31 '17 at 20:40
  • @Frank fair point, I suppose it would depend on the use case. For example, a list of monthly user data.frames and a separate list of monthly account data.frames may be ideal. – lmo Jan 31 '17 at 20:41

1 Answer

This seems like a case of trying to do too much with a pipe (https://github.com/hrbrmstr/rstudioconf2017/blob/master/presentation/writing-readable-code-with-pipes.key.pdf). I would recommend segmenting your process a little:

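# Get vector of files (full.names = TRUE so read_feather receives complete paths)
files <- list.files("C:/path/to/files", pattern = "\\.feather$", full.names = TRUE)

# Form object names: drop the directory and the .feather extension
object_names <- 
  files %>%
  basename() %>%
  tools::file_path_sans_ext()

# Read files and add to the global environment
# (with the default envir = NULL, list2env() would create a new environment instead)
lapply(files, 
       read_feather) %>%
  setNames(object_names) %>%
  list2env(envir = .GlobalEnv)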

If you really must do this with a single pipe, you should use mapply instead, as it has a USE.NAMES argument.

list.files("C:/path/to/files", pattern = "\\.feather$", full.names = TRUE) %>%
  mapply(read_feather,
         .,
         USE.NAMES = TRUE,
         SIMPLIFY = FALSE) %>%
  setNames(names(.) %>% basename() %>% tools::file_path_sans_ext()) %>%
  list2env(envir = .GlobalEnv)

Personally, I find the first option easier to reason about when debugging (I'm not a fan of pipes within pipes).
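
The question asks for names ending in `_df`, which neither snippet above produces. Here is a minimal sketch of how that suffix could be added, using small in-memory data frames as stand-ins for what `read_feather` would return. The file paths, the data frames, and the `target` environment are all hypothetical and only there so the example runs without any feather files on disk.

```r
# Hypothetical file paths; in practice these would come from list.files()
files <- c("C:/path/to/files/accounts_jan.feather",
           "C:/path/to/files/users_jan.feather")

# Stand-ins for lapply(files, read_feather); the contents are made up
tables <- list(data.frame(id = 1:3),
               data.frame(user = c("a", "b")))

# Drop the directory and extension, then append the "_df" suffix
object_names <- paste0(tools::file_path_sans_ext(basename(files)), "_df")

# Assign each list element to a variable in a chosen environment
# (use envir = .GlobalEnv to get them in the workspace instead)
target <- new.env()
list2env(setNames(tables, object_names), envir = target)

ls(target)
```

Writing into a separate environment first makes it easy to check the generated names before anything in the workspace is overwritten.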

Benjamin