I need to read data into R using read_csv(). I want to start with list_files() to create a vector with the full names of the files, then set a tibble to NULL and loop over the entries in that vector, each time reading a file and appending the resulting tibble, with bind_rows(), to the tibble that holds the data already read.

How would I do this in R, in exactly the way described above?

  • I'm not sure what you are asking here. RStudio is just an IDE for running R. Whatever R code you have should work fine in RStudio. There would be no difference. It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Mar 26 '21 at 16:29

2 Answers

You don't need to create a NULL tibble to do that. Assuming that the CSV files all have the same column names, you can proceed as follows; just replace "folder/with/csv/data" with the path to the folder where the data is stored on your PC.

library(tidyverse)

# First create a vector with the paths of your CSV files
# (pattern is a regular expression, so "\\.csv$" matches a literal ".csv" at the end of the name)
csv_paths <- list.files("folder/with/csv/data", pattern = "\\.csv$", full.names = TRUE)

# Then read all the CSV files into a list of tibbles
csv_list <- map(csv_paths, read_csv)

# And finally bind the list of tibbles into a single one
data <- bind_rows(csv_list)

# You can also write all of the above as a pipe
data <- list.files("folder/with/csv/data", pattern = "\\.csv$", full.names = TRUE) %>%
  map(read_csv) %>%
  bind_rows()
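
If you do want the loop exactly as described in the question, a minimal sketch could look like the one below. It assumes base R's list.files() for the listing step. The temporary folder, the two demo files and the names csv_dir, csv_paths and all_data are made up here only so that the example runs on its own.

```r
library(readr)
library(dplyr)

# Hypothetical demo data: two small CSV files in a fresh temporary folder
csv_dir <- tempfile("csv_demo")
dir.create(csv_dir)
write_csv(data.frame(id = 1:2, value = c("a", "b")), file.path(csv_dir, "part1.csv"))
write_csv(data.frame(id = 3:4, value = c("c", "d")), file.path(csv_dir, "part2.csv"))

# Vector with the full paths of the CSV files
csv_paths <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Start with NULL and append one file per iteration;
# bind_rows() ignores NULL inputs, so the first pass just returns the first tibble
all_data <- NULL
for (path in csv_paths) {
  current <- read_csv(path)
  all_data <- bind_rows(all_data, current)
}
```

Growing a tibble inside a loop copies the accumulated rows on every iteration, so for many or large files, reading everything into a list first and binding once tends to be faster.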
ornaldo_

An alternative approach to your problem would be this (if I understood it correctly):

library(stringr)
library(purrr)
library(io)
library(readr)

# path to your csv files folder 
path <- "C:/your_path_to_files_folder"

# get all files in the path folder and set full.names to TRUE to get not only the name but the whole path
file_vector <- io::list_files(path = path, full.names = TRUE)

# just to be sure you are using only the ".csv" files in case something else is in the directory
# (the pattern is a regular expression, so the dot is escaped and anchored to the end of the name)
file_vector_csvs <- file_vector[stringr::str_detect(file_vector, "\\.csv$")]

# use map instead of for-loop => result is a list (also a new column with the original filename is generated as it might be important for later filtering/use)
purrr::map(file_vector_csvs, ~readr::read_csv(.x) %>% dplyr::mutate(file = .x))

# use map_df to get a data.frame (which is tabular and can be converted to a tibble directly) (also a new column with the original filename is generated as it might be important for later filtering/use)
purrr::map_df(file_vector_csvs, ~readr::read_csv(.x) %>% dplyr::mutate(file = .x))

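For the last call to work, columns that share a name must have compatible types in every file. map_df() combines the results with bind_rows(), which matches columns by name rather than by position and fills columns that are missing from a file with NA, so check whether that is what you want for your files.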
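
To see what that means in practice, here is a small sketch of how dplyr::bind_rows() treats inputs whose columns differ. The data frames first and second are invented for illustration and stand in for the tibbles returned by read_csv().

```r
library(dplyr)

# Two hypothetical inputs: same columns in a different order, plus one extra column
first  <- data.frame(id = 1:2, value = c("a", "b"))
second <- data.frame(value = "c", id = 3L, extra = TRUE)

# Columns are matched by name; "extra" is filled with NA for the rows from `first`
combined <- bind_rows(first, second)

# Columns with the same name but incompatible types raise an error instead
conflict <- tryCatch(
  bind_rows(data.frame(id = 1), data.frame(id = "x")),
  error = function(e) "type conflict"
)
```

If a column comes back with different types in different files, passing an explicit col_types specification to read_csv() is one way to keep the types consistent before binding.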

marc_s
DPH