Importing multiple .csv files with variable column types into R

Question

How can I properly build an lapply call that reads all the .csv files in one directory, loads every column as a string, and then binds them into one data frame?

Per this, I have a way to get all the .csv files loaded and bound into a data frame. Unfortunately, the files differ in how their columns get type-cast, which gives me this error:

Error: Can not automatically convert from character to integer in column

I have tried adding the column-type arguments to the code so that everything stays as character; I am now stuck on getting my lapply 'loop' to correctly reference the file it is processing in each iteration.

srvy1 <- structure(
  list(RESPONSE_ID = 584580L, QUESTION_ID = 328L, SURVEY_ID = 2324L,
       AFF_ID_INV_RESP = 5L),
  .Names = c("RESPONSE_ID", "QUESTION_ID", "SURVEY_ID", "AFF_ID_INV_RESP"),
  class = "data.frame", row.names = c(NA, -1L))

srvy2 <- structure(
  list(RESPONSE_ID = 584580L, QUESTION_ID = 328L, SURVEY_ID = 2324L,
       AFF_ID_INV_RESP = "bovine"),
  .Names = c("RESPONSE_ID", "QUESTION_ID", "SURVEY_ID", "AFF_ID_INV_RESP"),
  class = "data.frame", row.names = c(NA, -1L))

files = list.files(pattern="*.csv")
tbl = lapply(files, read_csv(files, col_types = cols(.default = col_character()))) %>% bind_rows

Is there an easy fix for this that keeps me in the tidyverse, or must I drop down a level and build the for loop myself, per this?

Most likely due to not providing a reproducible example, with data. You have some errors in your code. It should be `lapply(files, read_csv, col_types = cols(.default = col_character()))` — Jake Kaupp, Nov 16 '16 at 19:08
I would supply `head(x,5)` of two of the files, create a list from those and `dput` the results and paste them here. — Jake Kaupp, Nov 16 '16 at 19:15
`tbl = lapply(files, read_csv(col_types = cols(.default = col_character())))` gets the following error _Error in inherits(x, "connection") : argument "file" is missing, with no default_ and this `tbl = lapply(files, read_csv(files, col_types = cols(.default = col_character())))` gets this _Error in switch(tools::file_ext(path), gz = gzfile(path, ""), bz2 = bzfile(path, : EXPR must be a length 1 vector In addition: Warning messages: 1: In if (grepl("\n", x)) { : the condition has length > 1 and only the first element will be used_ which typically means that I need a rowwise call. — leerssej, Nov 16 '16 at 19:20
The `lapply` should be the form `lapply(x, FUN, ...)` where ... is the arguments passed to FUN. You're filling the arguments within FUN. It should be `lapply(files, read_csv, col_types = cols(.default = col_character()))` — Jake Kaupp, Nov 16 '16 at 19:22
readr doesn't like the `lapply(files, read_csv, col_types = cols(.default = col_character())%>% bind_rows)` as it gives this warning: _Error: col_types must be NULL, a list or a string_ but it is very happy with the `tbl = lapply(files, read_csv, col_types = cols(.default = "c")) %>% bind_rows`. You're the bomb! Can you throw your comments into an answer? — leerssej, Nov 16 '16 at 19:31

score 9 · Accepted Answer · answered Nov 16 '16 at 19:32

The `lapply` call should take the form `lapply(x, FUN, ...)`, where `...` holds the arguments passed on to `FUN`. You're calling `read_csv` and filling in its arguments inside `lapply` instead of passing the function itself. It should be `lapply(files, read_csv, col_types = cols(.default = "c"))`
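To make the `lapply(x, FUN, ...)` form concrete, here is a minimal sketch. The temporary directory, the file names, and the two-column layout are invented for illustration and only loosely mirror the `srvy1`/`srvy2` data from the question; it assumes readr and dplyr are installed.

```r
library(readr)
library(dplyr)

# Hypothetical setup: write two small CSVs to a temporary directory so the
# sketch is self-contained. The second file has text in AFF_ID_INV_RESP.
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
writeLines(c("RESPONSE_ID,AFF_ID_INV_RESP", "584580,5"),
           file.path(demo_dir, "srvy1.csv"))
writeLines(c("RESPONSE_ID,AFF_ID_INV_RESP", "584581,bovine"),
           file.path(demo_dir, "srvy2.csv"))

# list.files() treats `pattern` as a regular expression, not a glob.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Pass the function itself, not a call to it. lapply() forwards col_types
# to read_csv() once per file, so every column is read as character.
tbl <- lapply(files, read_csv, col_types = cols(.default = col_character())) %>%
  bind_rows()
```

Because every column arrives as character, `bind_rows()` no longer has to reconcile an integer column in one file with a character column in another.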

If you prefer a tidyverse solution:

files %>%
  map_df(~read_csv(.x, col_types = cols(.default = "c")))

This binds the results into a single data frame at the end.
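As a possible variation, assuming purrr 1.0.0 or later (where `map_df()` is marked as superseded), the same idea can be written with `map()` followed by `list_rbind()`. The sketch below also records which file each row came from; the temporary files and the `source_file` column name are illustrative choices, not part of the original answer.

```r
library(readr)
library(purrr)
library(dplyr)  # loaded here only for the %>% pipe

# Hypothetical setup: two small CSVs whose shared column differs in type.
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
writeLines(c("RESPONSE_ID,AFF_ID_INV_RESP", "584580,5"),
           file.path(demo_dir, "srvy1.csv"))
writeLines(c("RESPONSE_ID,AFF_ID_INV_RESP", "584581,bovine"),
           file.path(demo_dir, "srvy2.csv"))

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

combined <- files %>%
  set_names(basename(files)) %>%                          # name each element by its file
  map(~ read_csv(.x, col_types = cols(.default = "c"))) %>%
  list_rbind(names_to = "source_file")                    # names become a column
```

Once the data are combined, `readr::type_convert()` can be applied to re-guess column types across the whole data frame if all-character columns are not what you want downstream.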

Jake Kaupp


thank you. I have been struggling with this for hours. – Nova Feb 27 '19 at 14:30
