I'm just learning R. I have 300 different files containing rainfall data. I want to create a function that takes a range of values (e.g., 20-40) and then reads the csv files named "020.csv", "021.csv", "022.csv", etc., up to "040.csv".

Each of these files has a variable named "rainfall". I want to open each csv file, extract the "rainfall" values and store (append) them to some sort of object, like a data frame (maybe something else is better?). So, when I'm done, I'll have a data frame or list with a single column containing rainfall data from all processed files.

This is what I have...

rainfallValues <- function(id = 1:300) {
    df <- data.frame()

    # Read anywhere from 1 to 300 files
    for (i in id) {
        # Form a file name, zero-padded to three digits (e.g. "020.csv")
        fileName <- sprintf("%03d.csv", i)

        # Read the csv file, which has four variables (columns). I'm interested in
        # a variable named "rainfall".
        x <- read.csv(fileName, header = TRUE)

        # This is where I am stuck. I know how to extract the "rainfall" variable values from
        # x, I just don't know how to append them to my data frame.
    }
}
Randy Minder
  • You might find [this](http://stackoverflow.com/questions/23190280/issue-in-loading-multiple-csv-files-into-single-dataframe-in-r-using-rbind) helpful. Rather than enlarging a "data.frame" iteratively and re-allocating memory, you could collect all the necessary data in a "list" and then use `do.call(rbind, list)`. – alexis_laz Jun 29 '16 at 14:55
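
The approach suggested in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: the temporary directory `dataDir`, the `day` column, and the sample values are all made up so that the snippet runs on its own, and the real files would already exist on disk.

```r
# Hypothetical setup: write three small files into a temporary directory
# so the sketch is self-contained. The values are invented.
dataDir <- tempfile("rain")
dir.create(dataDir)
for (i in 20:22) {
    write.csv(data.frame(day = 1:2, rainfall = c(i, i + 0.5)),
              file.path(dataDir, sprintf("%03d.csv", i)),
              row.names = FALSE)
}

# Read every file into a list of data frames, keeping only "rainfall"
frames <- lapply(20:22, function(i) {
    x <- read.csv(file.path(dataDir, sprintf("%03d.csv", i)))
    x["rainfall"]   # single-bracket indexing keeps a one-column data frame
})

# Bind all the pieces together in one step instead of growing a
# data frame inside the loop
allRain <- do.call(rbind, frames)
```

Here `allRain` is a data frame with a single `rainfall` column, with rows in the order the files were read.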

1 Answer

Here is a method using lapply that returns a list of rainfall vectors, one per file in id:

rainList <- lapply(id, function(i) {
    temp <- read.csv(sprintf("%03d.csv", i))
    temp$rainfall
})

To put this into a single vector:

rainVec <- unlist(rainList)

comment
The unlist function preserves the order in which the files were read. The first element of rainVec is the first observation of the rainfall column from the first file in id, the second element is the second observation from that file, and so on through the last observation of the last file.
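
Putting the two steps together, the function from the question could look something like the sketch below. The `dataDir` argument and the demo files are assumptions added here so the example runs anywhere; they are not part of the original question.

```r
# A minimal sketch: read the "rainfall" column from each file and
# return all values as one vector. `dataDir` is a hypothetical
# argument giving the folder that holds the csv files.
rainfallValues <- function(id = 1:300, dataDir = ".") {
    rainList <- lapply(id, function(i) {
        temp <- read.csv(file.path(dataDir, sprintf("%03d.csv", i)))
        temp$rainfall
    })
    unlist(rainList)   # flatten the list into a single vector
}

# Demo with two made-up files in a temporary directory
demoDir <- tempfile("rain")
dir.create(demoDir)
write.csv(data.frame(rainfall = c(1.5, 2.5)),
          file.path(demoDir, "020.csv"), row.names = FALSE)
write.csv(data.frame(rainfall = 3.5),
          file.path(demoDir, "021.csv"), row.names = FALSE)

rainVec <- rainfallValues(20:21, dataDir = demoDir)
```

With these demo files, `rainVec` holds the two values from "020.csv" followed by the one value from "021.csv".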

lmo
  • How does this work? Each time the method is called, are the results appended to the rainList vector? I cannot seem to make this work. The only results I get back are from the last file read, not from all files. – Randy Minder Jun 29 '16 at 17:23
  • The rainList line should produce a list with the same length as the number of files in id. Each element of the list is a vector containing the rainfall column from one file in id. The rainVec line coerces that list into a single vector. – lmo Jun 29 '16 at 17:27
  • Oh, I see now. This isn't what I wanted. I didn't want, say, 10 different lists, with readings from the 10 files in the 10 lists. I wanted a SINGLE list with readings from all 10 files in the single list. – Randy Minder Jun 29 '16 at 17:31
  • By "lists" do you mean vectors? R lists are the most general type of data structure in R and can hold lots of stuff. Calling `unlist` on a list containing vectors will produce a single vector, which is what it sounds like you are looking for. Or are you looking for a data.frame at the end, where each rainfall vector is a column? – lmo Jun 29 '16 at 17:37
  • Thank you, I got it now. I'm still struggling with the "unique" terminology R uses. I've been a software developer for 25+ years and the terminology R uses seems mostly backwards to me. – Randy Minder Jun 29 '16 at 17:46
  • Yeah, it can be odd to get used to. Just a point: working with R lists with tools like `lapply` is worth the time investment, though it takes some adjustment. – lmo Jun 29 '16 at 17:51
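
To make the list, vector, and data.frame distinction from the comments concrete, here is a small sketch with made-up values standing in for two files. The column names `file020` and `file021` are hypothetical, and the data.frame step assumes every file has the same number of observations.

```r
# Made-up rainfall values standing in for two files
rainList <- list(c(1.2, 0.0), c(3.4, 5.6))

# A list keeps one element per file
length(rainList)

# unlist() flattens the list into a single vector of all readings
rainVec <- unlist(rainList)
length(rainVec)

# Alternatively, one column per file (requires equal-length vectors)
rainDf <- as.data.frame(setNames(rainList, c("file020", "file021")))
```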