I have 8 CSV files in the same directory and need to import them into a single data frame in R. They all follow the same naming convention ("dataUK_1.csv", "dataUK_2.csv", etc.) and have exactly the same column structure.

I've managed to create a vector of all the file names (including the full directory) by using:

files = list.files("/Users/iarwain/Data", pattern=".csv", full.names=T)

I'm just not sure how to pass these names to the read.csv command so that it loops 8 times, importing each file and adding its content as new rows into a single data frame, so that the end result is one data frame containing all rows of data from the 8 CSVs.

Thanks!

Iarwain
  • Does this answer your question? [How do I make a list of data frames?](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames) – tjebo Jan 23 '21 at 13:53

2 Answers

You don't want a loop. You want lapply.

file_list <- list.files("/Users/iarwain/Data", pattern = "\\.csv$", full.names = TRUE)


combined_files <- do.call("rbind", lapply(file_list, read.csv))

Translation: lapply() applies read.csv to each element of file_list and returns a list of data frames. do.call() then passes that whole list to a single rbind() call, and the result is assigned to combined_files.
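To make the two steps visible, here is a minimal, self-contained sketch that splits them apart. The temporary directory, the demo files, and the intermediate name data_list are made up for illustration; swap in your own directory path.

```r
# Hypothetical demo data: write three small CSVs into a temporary directory.
demo_dir <- file.path(tempdir(), "csv_rbind_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(
    data.frame(id = i, value = c(10, 20) * i),
    file.path(demo_dir, sprintf("dataUK_%d.csv", i)),
    row.names = FALSE
  )
}

# pattern is a regular expression, so escape the dot and anchor the match
file_list <- list.files(demo_dir, pattern = "^dataUK_[0-9]+\\.csv$",
                        full.names = TRUE)

# Step 1: lapply() returns a list holding one data frame per file
data_list <- lapply(file_list, read.csv)

# Step 2: do.call() hands every element of that list to one rbind() call
combined_files <- do.call("rbind", data_list)

str(combined_files)
```

Because rbind() matches columns by name, this only works when every file has the same columns, which is the case described in the question.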

Matt74
  • For some reason it's prompting me for some extra input (the > turns to a +) - even when I change files to file_list (which I assume is what you meant in the second line) – Iarwain Jan 30 '15 at 22:24
  • Oh dear it didn't even occur to me that it was only a missing bracket, it's been one of those days... thanks so much for your help! – Iarwain Jan 30 '15 at 23:03

In tidyverse you can just add a pipe and a map_df()

library(tidyverse)

combined_files <- list.files("/Users/iarwain/Data", pattern = "\\.csv$", full.names = TRUE) %>%
    map_df(read_csv)

Specifically, as Hadley describes here (about halfway down):

map_df(x, f) is effectively the same as do.call("rbind", lapply(x, f)) but under the hood is much more efficient.

and a thank you to Jake Kaupp for introducing me to map_df() here.
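For anyone who wants to try this without touching real data, here is a small sketch of the map_df() approach. It assumes the purrr, readr and dplyr packages are installed (map_df() relies on dplyr for the row binding); the temporary directory and demo files are made up for illustration. Note that read_csv is passed as a function, not called: map_df() calls it once per file path.

```r
library(purrr)
library(readr)

# Hypothetical demo data: write three small CSVs into a temporary directory.
demo_dir <- file.path(tempdir(), "csv_map_df_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(
    data.frame(id = i, value = c(10, 20) * i),
    file.path(demo_dir, sprintf("dataUK_%d.csv", i)),
    row.names = FALSE
  )
}

demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# map_df() reads each file with read_csv() and row-binds the results
# into a single tibble.
combined_tbl <- map_df(demo_files, read_csv)

print(combined_tbl)
```

Recent purrr releases mark map_df() as superseded in favour of map() followed by list_rbind(), so on an up-to-date installation that form may be the better choice.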

leerssej