I am trying to load multiple CSV files into a project and figured I could use one array, say data_in, for all of them, so that I can reference them like this:

data_in[,,1] ##first csv file
data_in[1,,2] ## first row of second csv file

and so on. To read the files I have this loop:

for (i in seq_along(names)) {
    file_name <- paste(names[i],".csv", sep = "")
    data_in[,,i] <- read.csv(file_name, header=T, sep= ",")
    }

This does not work. I'm new to R: do I need to declare the dimensions of data_in before I load the data? Is there a way to read the data in using only an index for the CSV file, or do I have to use a 3D array? Sorry if this is sort of basic.

Any help is much appreciated. Have a nice day.

terra_rob
  • Does `data_in <- lapply(names, read.csv)` do basically what you want? – Hugh Jun 26 '14 at 07:57
  • possible duplicate of [Read multiple CSV files into separate data frames](http://stackoverflow.com/questions/5319839/read-multiple-csv-files-into-separate-data-frames) – rrs Jun 26 '14 at 13:34

2 Answers

To expand on Hugh's comment: you want list.files() and lapply(), which together read all your files into a list of data.frames that you can then access using [[ ]].

files <- list.files(pattern = "\\.csv$")  # only names ending in ".csv"
data_in <- lapply(files, read.csv)

read.csv is essentially read.table with header = TRUE and sep = "," as defaults, so you don't need to specify them.
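As a quick check of that claim, here is a minimal sketch that reads the same file both ways. The temporary file and the object names (tmp, via_csv, via_table) are made up for illustration. read.csv also changes a few other defaults (quote, fill, comment.char), so the two calls can differ on messier files than this one.

```r
# Hypothetical example file written to a temporary location
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3, y = c(2.5, 3.5, 4.5)), tmp, row.names = FALSE)

# Read it with read.csv defaults and with the equivalent read.table arguments
via_csv   <- read.csv(tmp)
via_table <- read.table(tmp, header = TRUE, sep = ",")

# Compare the two data frames
identical(via_csv, via_table)
```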

Then access an individual data frame with [[ ]], e.g.

head(data_in[[1]])
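
If you would rather refer to each file by name than by position, one option is to name the list elements after the files. This is only a sketch: the temporary directory and the file names first.csv and second.csv are invented for the example, and tools::file_path_sans_ext() is used to strip the extension.

```r
# Hypothetical setup: write two small CSV files to a temporary directory
tmp_dir <- tempfile("csv_demo")
dir.create(tmp_dir)
write.csv(data.frame(a = 1:3, b = 4:6),
          file.path(tmp_dir, "first.csv"), row.names = FALSE)
write.csv(data.frame(a = 7:9, b = 10:12),
          file.path(tmp_dir, "second.csv"), row.names = FALSE)

# Match only names ending in ".csv" and keep the full paths
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
data_in <- lapply(files, read.csv)

# Name each element after its file, without directory or extension
names(data_in) <- tools::file_path_sans_ext(basename(files))

data_in[["second"]]       # the whole second file
data_in[["second"]][1, ]  # first row of the second file
data_in[[1]][1, ]         # same idea, by position instead of name
```

This gives roughly the indexing the question asks for (file first, then row), without requiring every file to have the same dimensions.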
JeremyS

If you're hell-bent on an array, here's one approach. I would still suggest lapply, if only because I'm more used to it (and maybe because arrays are more appropriate for numeric matrices).

set.seed(357)
# note that these are actually matrices
df1 <- matrix(sample(1:100, 9), ncol = 3)
df2 <- matrix(sample(1:100, 9), ncol = 3)
df3 <- matrix(sample(1:100, 9), ncol = 3)

my.df <- list(df1, df2, df3) # this is what you get with list.files/lapply approach
my.df

out.array <- array(rep(NA, 3*3*3), dim = c(3, 3, 3))
out.array

for (i in 1:3) {
  out.array[,, i] <- my.df[[i]]
}
out.array

, , 1

     [,1] [,2] [,3]
[1,]   11   22   88
[2,]    6   63   70
[3,]   28   45   72

, , 2

     [,1] [,2] [,3]
[1,]  100   45   17
[2,]   62   22   30
[3,]   52   57   83

, , 3

     [,1] [,2] [,3]
[1,]   69   45   67
[2,]   43   36   65
[3,]   18    5   35
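
With real CSV files, lapply(files, read.csv) returns data frames rather than matrices, so they need converting before they can go into an array. A minimal sketch, assuming every file has the same number of rows and columns and that the columns are numeric; the list dfs and its contents are hypothetical stand-ins for the imported files:

```r
# Hypothetical stand-ins for data frames returned by lapply(files, read.csv)
dfs <- list(
  data.frame(x = 1:3, y = 4:6),
  data.frame(x = 7:9, y = 10:12)
)

# A regular array requires all files to share the same dimensions
stopifnot(length(unique(lapply(dfs, dim))) == 1)

# Convert each data frame to a numeric matrix
mats <- lapply(dfs, data.matrix)

# Stack the matrices along a third dimension (one slice per file)
arr <- array(unlist(mats), dim = c(dim(mats[[1]]), length(mats)))

dim(arr)     # rows, columns, number of files
arr[, , 1]   # first file
arr[1, , 2]  # first row of the second file
```

Building the array in one call avoids declaring it with NA and filling it in a loop, but it relies on the same assumption as the loop above: identical dimensions across files.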
Roman Luštrik