
I am writing the following function that creates a correlation matrix and finds highly correlated (>0.75) values:

var.cor <- function(data, cols){
  cor.mat <- cor(data [, cols])
  cor.mat <- round(cor.mat, 2)
  high.corr <- findCorrelation(cor.mat, cutoff = 0.75)
  print(cor.mat)
  print(high.corr)
} 

I want to give the function a range of column numbers, i.e., var.cor(data = dat, 10:20) should run the function on columns 10:20. What is the correct way to specify cols in the second line of the function? When I run var.cor("dat1", 10:20) I get an error message: Error in data[, cols] : incorrect number of dimensions

Ryan
  • Could you please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – Eric May 22 '20 at 12:36

2 Answers


Your data argument is just a string, like "mtcars" in the example below, not the data frame itself, so data[, cols] has nothing two-dimensional to index. Use get to fetch the object with that name from .GlobalEnv. Example:

var.cor <- function(data, cols){
  cor.mat <- cor(get(data, envir=.GlobalEnv)[, cols])
  cor.mat <- round(cor.mat, 2)
  high.corr <- caret::findCorrelation(cor.mat, cutoff = 0.75)
  print(cor.mat)
  print(high.corr)
} 
var.cor("mtcars", cols=1:2)
#       mpg   cyl
# mpg  1.00 -0.85
# cyl -0.85  1.00
# [1] 2
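
If you would rather pass the data frame itself instead of its name, a minimal sketch could look like the following. The name var.cor2, the cutoff argument, and the returned list are my own choices for illustration, and the filter is written in base R rather than with findCorrelation, so the output is not identical to the function above.

```r
# Hypothetical variant: `data` is the data frame (or matrix) itself, not its name.
var.cor2 <- function(data, cols, cutoff = 0.75) {
  # Fail early with a clear message if a string such as "dat1" is passed
  stopifnot(is.data.frame(data) || is.matrix(data))
  # drop = FALSE keeps a two-dimensional object even when a single column is selected
  cor.mat <- round(cor(data[, cols, drop = FALSE]), 2)
  # Row/column positions of pairs in the upper triangle whose absolute
  # correlation exceeds the cutoff (base R, not caret::findCorrelation)
  high.pairs <- which(abs(cor.mat) > cutoff & upper.tri(cor.mat), arr.ind = TRUE)
  list(cor.mat = cor.mat, high.pairs = high.pairs)
}

res <- var.cor2(mtcars, cols = 1:2)  # note: mtcars, not "mtcars"
res$cor.mat
res$high.pairs
```

Called this way, var.cor2("mtcars", 1:2) stops at the stopifnot check instead of failing later inside the indexing.
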
jay.sf

You should give us some information on your variable 'data' (it seems not to have 10 columns). With one alteration (see below) your code works fine for me using the built-in iris dataset:

var.cor <- function(data, cols){
  cor.mat <- cor(data [, cols])
  cor.mat <- round(cor.mat, 2)
  high.corr <- cor.mat > 0.75
  print(cor.mat)
  print(high.corr)
} 
> var.cor(iris, 1:3)
             Sepal.Length Sepal.Width Petal.Length
Sepal.Length         1.00       -0.12         0.87
Sepal.Width         -0.12        1.00        -0.43
Petal.Length         0.87       -0.43         1.00
             Sepal.Length Sepal.Width Petal.Length
Sepal.Length         TRUE       FALSE         TRUE
Sepal.Width         FALSE        TRUE        FALSE
Petal.Length         TRUE       FALSE         TRUE

You don't share your findCorrelation function, but it seems to just be a filter (which I have added as high.corr <- cor.mat > 0.75 above).

In short, the actual answer to your question seems to be that your 'data' variable is not the shape you think it is.
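
A quick way to check this is to inspect the object before indexing it. The sketch below uses the built-in mtcars data and a variable x that I made up for illustration; it assumes the function was called with a quoted name, as in var.cor("dat1", 10:20).

```r
x <- "mtcars"   # a quoted name is a character vector, not a data frame
class(x)        # "character"
dim(x)          # NULL: there are no rows or columns to index

# Indexing the string with [, cols] reproduces the error from the question
msg <- tryCatch(x[, 1:2], error = function(e) conditionMessage(e))
msg             # "incorrect number of dimensions"

# The data frame itself has two dimensions, so [, cols] works
dim(mtcars)     # 32 11
```
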

randr