I have 70 CSV files with the same columns in a folder, each 0.5 GB. I want to import them into a single data frame in R.

I can import each file individually without problems, like this:

df <- read_delim("file.csv", "|", escape_double = FALSE,
                 col_types = cols(pc_no = col_character(), id_key = col_character()),
                 trim_ws = TRUE)

To import all of them at once, I tried the code below, which fails with this error: argument "delim" is missing, with no default

tbl <-
list.files(pattern = "*.csv") %>% 
map_df(~read_delim("|", escape_double = FALSE, col_types = cols(pc_no = col_character(), id_key = col_character()), trim_ws = TRUE))

With read_csv the import succeeds, but the result has only one column, which contains all the columns and values.

 tbl <-
 list.files(pattern = "*.csv") %>% 
 map_df(~read_csv(., col_types = cols(.default = "c")))
kimi
  • Well the first positional parameter to `read_delim` is `file` not `delim` so perhaps you should name the parameter or actually pass `.x` or `.` to the first parameter. – hrbrmstr Oct 19 '18 at 16:21
  • It seems like others have answered your direct question, but I would also encourage you to consider the fread function from the data.table library. In my experience fread is much faster at reading in large files, and you can immediately convert it to a data.frame to suit your needs. – Jay Oct 19 '18 at 17:57
  • https://stackoverflow.com/a/72929493/1563960 – webb Jul 10 '22 at 15:06

1 Answer

In your second block of code, you're missing the . placeholder for the file name, so read_delim interprets your arguments as read_delim(file = "|", delim = <nothing provided>, ...). Try:

tbl <- list.files(pattern = "*.csv") %>% 
  map_df(~ read_delim(., delim = "|", escape_double = FALSE,
                      col_types = cols(pc_no = col_character(), id_key = col_character()),
                      trim_ws = TRUE))

I named delim= explicitly here, though it's not strictly necessary. Had you done that in your failing attempt, however, you would have seen:

readr::read_delim(delim = "|", escape_double = FALSE,
                  col_types = cols(pc_no = col_character(), id_key = col_character()),
                  trim_ws = TRUE)
# Error in read_delimited(file, tokenizer, col_names = col_names, col_types = col_types,  : 
#   argument "file" is missing, with no default

which is more indicative of the actual problem.
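For anyone who wants to try the pattern without the original data, here is a minimal, self-contained sketch. The temporary folder, the two tiny files, and the value column are made up for illustration; only pc_no, id_key, and the read_delim arguments come from the question. It also assumes readr, purrr, and dplyr are installed (map_df relies on dplyr to bind the rows).

```r
library(readr)
library(purrr)

# Hypothetical stand-ins for the real files: two small pipe-delimited
# files written to a temporary folder.
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
writeLines(c("pc_no|id_key|value", "001|A1|10", "002|A2|20"),
           file.path(demo_dir, "part1.csv"))
writeLines(c("pc_no|id_key|value", "003|A3|30"),
           file.path(demo_dir, "part2.csv"))

# list.files() treats `pattern` as a regular expression, so "\\.csv$" is used
# here; full.names = TRUE returns paths that do not depend on the working
# directory.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Each file path is passed as the first argument (file); delim is named
# explicitly so a missing file argument cannot be mistaken for it.
tbl <- map_df(files, function(f) {
  read_delim(f, delim = "|", escape_double = FALSE,
             col_types = cols(pc_no = col_character(), id_key = col_character()),
             trim_ws = TRUE)
})
```

Because pc_no and id_key are read as character, values such as 001 keep their leading zeros. This also suggests why the read_csv attempt produced a single column: read_csv splits on commas, so a line with only | separators is parsed as one field.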

r2evans