corr <- function(directory, threshold = 0) {
  setwd("c:/users/hp1/desktop")
  files_full <- list.files(directory,full.names = TRUE)
  cr <- data.frame()
  j = 1
  for(i in 1:332)
  {
    y <- read.csv(files_full[i])
    z <- y[complete.cases(y),]

    if (nrow(z) > threshold){
      cr[j] <- cor(z[,'sulfate'], z[,'nitrate'], method = "pearson")

      j = j+1
      }


    }
  cr

  }

It's showing the following error:

Error in [<-.data.frame(*tmp*, j, value = -0.222552560758546) : replacement has 1 row, data has 0

I was expecting that, as j increments, values would get added to the cr data frame. However, that is not happening. Please suggest the necessary changes.

1 Answer


You could try something like this. If you provide a reproducible example, I can show you how to clean up the result. `sapply` will try to simplify the result, but you can stop that by specifying `simplify = FALSE` and then removing the unwanted list elements.

setwd("c:/users/hp1/desktop") # I would use this outside a function

corr <- function(directory, threshold = 0) {
  files_full <- list.files(directory, full.names = TRUE)

  sapply(files_full, FUN = function(x) {
    y <- read.csv(x)
    z <- y[complete.cases(y),]

    if (nrow(z) > threshold){
      out <- cor(z[,'sulfate'], z[,'nitrate'], method = "pearson")
    } else {
      return(NA) # or some other way you want to handle the result
    }
  })
}
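As for what went wrong in the original loop: the error most likely comes from `cr` starting out as an empty data frame. `cr[j] <- value` then tries to add a column of length 1 to a data frame with 0 rows. Here is a minimal sketch of that failure, and of the same loop pattern with a numeric vector instead. The correlation values are made up for illustration.

```r
# An empty data frame has 0 rows, so assigning a length-1 column fails.
cr <- data.frame()
res <- tryCatch({
  cr[1] <- -0.22
  "ok"
}, error = function(e) conditionMessage(e))
print(res)  # the error message, not "ok"

# A numeric vector, on the other hand, grows by index without complaint.
cr_vec <- numeric(0)
j <- 1
for (v in c(-0.22, 0.10, -0.05)) {  # made-up correlation values
  cr_vec[j] <- v
  j <- j + 1
}
print(cr_vec)
```

So initialising `cr` with `numeric(0)` instead of `data.frame()` should be enough to make the original loop collect its results.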
Roman Luštrik
  • Thanks for your reply. Could you tell me what went wrong in my code? Also, where will all the correlations be stored? – Shivam Munshi Jul 17 '15 at 10:39
  • @ShivamMunshi once you run the function `myresult <- corr(...)`, the result will be in `myresult`. – Roman Luštrik Jul 17 '15 at 10:41
  • Don't you think 'out' will get rewritten after every loop? – Shivam Munshi Jul 17 '15 at 10:42
  • I ran your code with `cr <- corr("specdata", 400)` followed by `head(cr)`. This did not produce the expected output. – Shivam Munshi Jul 17 '15 at 10:44
  • @ShivamMunshi you will probably have to adapt the output. If you provide a reproducible example and show what the desired output should look like, I'll be more than happy to help out. – Roman Luštrik Jul 17 '15 at 10:54
  • The else statement gives a NULL when the if condition isn't satisfied. We do not want anything in the output if the if condition is not satisfied. The output I'm getting is $`specdata/001.csv` NULL $`specdata/002.csv` [1] -0.01895754 $`specdata/003.csv` NULL $`specdata/004.csv` [1] -0.04389737 $`specdata/005.csv` [1] -0.06815956 $`specdata/006.csv` NULL and the expected output is ## [1] -0.01896 -0.14051 -0.04390 -0.06816 -0.12351 -0.07589 Thanks. – Shivam Munshi Jul 17 '15 at 11:08
  • @ShivamMunshi please edit your question to include some simulated data. I suggest you have a look at [this question](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for some tips on how to do that. – Roman Luštrik Jul 17 '15 at 12:42
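Since no reproducible example was posted, here is a rough sketch with simulated data of how the result could be reduced to a plain, unnamed numeric vector like the expected output. The function name `corr_vec`, the temporary directory, and the simulated files are all assumptions made for illustration; only the `sulfate` and `nitrate` column names come from the question.

```r
# Hypothetical variant of corr() that returns an unnamed numeric vector
# and silently drops files that do not pass the threshold.
corr_vec <- function(directory, threshold = 0) {
  files_full <- list.files(directory, full.names = TRUE)
  res <- vapply(files_full, function(f) {
    y <- read.csv(f)
    z <- y[complete.cases(y), ]
    if (nrow(z) > threshold) {
      cor(z[, "sulfate"], z[, "nitrate"], method = "pearson")
    } else {
      NA_real_  # placeholder, removed below
    }
  }, numeric(1))
  unname(res[!is.na(res)])
}

# Simulated data: three small CSV files in a temporary directory.
set.seed(1)
tmp <- file.path(tempdir(), "specdata_demo")
dir.create(tmp, showWarnings = FALSE)
n_rows <- c(5, 50, 80)
for (k in seq_along(n_rows)) {
  d <- data.frame(sulfate = rnorm(n_rows[k]), nitrate = rnorm(n_rows[k]))
  write.csv(d, file.path(tmp, sprintf("%03d.csv", k)), row.names = FALSE)
}

# Only the two files with more than 10 complete rows contribute a value.
cr <- corr_vec(tmp, threshold = 10)
print(cr)
```

Note that dropping `NA` this way would also drop a file whose correlation is `NA` for some other reason, so a different placeholder may be safer on real data.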