
I have a function that computes the correlation matrix of a data set containing both categorical and continuous variables:

correlation <- function(mtrc, ...) {
    xx <- do.call(rbind, lapply(colnames(mtrc), function(ex_i) {
        ty_i <- wtype(mtrc, ex_i)
        yy <- sapply(colnames(mtrc), function(ex_j) {
            ty_j <- wtype(mtrc, ex_j)

            if(ty_i == "numeric" && ty_j == "numeric") {
                cor(mtrc[ , c(ex_i, ex_j)], ...)[1, 2]
            } else if(ty_i == "factor" && ty_j == "factor") {
                cramersV(table(mtrc[ , c(ex_i, ex_j)]), ...)
            } else {
                fm <- paste(ex_i, "~", ex_j)
                if(ty_i == "factor") {
                    fm <- paste(ex_j, "~", ex_i)
                }
                fm <- lm(fm, data = mtrc[ , c(ex_i, ex_j)], ...)
                lm.beta(fm)
            }
        })
        names(yy) <- colnames(mtrc)
        yy
    }))
    rownames(xx) <- colnames(mtrc)
    xx
}

My question is how to properly pass the `...` argument to cor, cramersV and lm. The argument names of these three functions do not match, so if the user gives an argument meant for cor and there is a categorical variable in the matrix, cramersV or lm raises an error (unused argument...).

So... I'm open to any solution or idea you can have.

carlesh
  • Do `dots <- list(...)` and check, e.g., which of the elements match with `formalArgs(lm)` and subset the list to these. Add the `fm` and `data` parameters to this list and call `lm` using `do.call`. I would show you how, but you don't provide a reproducible example. – Roland Jun 22 '16 at 07:15
  • You might also find the functions `dots` or `named_dots` in the `pryr` package useful for this kind of task. – David_B Jun 22 '16 at 07:20

1 Answer


When I wrote my answer below, I did not realize that there was an excellent question by Richard Scriven from 2014: Split up `...` arguments and distribute to multiple functions. So yes, this is a duplicate question. But I will keep my answer here, as it represents what I thought (and what I think).


Original answer

I think it is better to give your correlation function finer control:

correlation <- function(mtrc, cor.opt = list(), cramersV.opt = list(), lm.opt = list()) {
    xx <- do.call(rbind, lapply(colnames(mtrc), function(ex_i) {
        ty_i <- wtype(mtrc, ex_i)
        yy <- sapply(colnames(mtrc), function(ex_j) {
            ty_j <- wtype(mtrc, ex_j)

            if(ty_i == "numeric" && ty_j == "numeric") {
                do.call("cor", c(list(x = mtrc[ , c(ex_i, ex_j)]), cor.opt))[1, 2]
            } else if(ty_i == "factor" && ty_j == "factor") {
                do.call("cramersV", c(list(x = table(mtrc[ , c(ex_i, ex_j)])), cramersV.opt))
            } else {
                fm <- paste(ex_i, "~", ex_j)
                if(ty_i == "factor") {
                    fm <- paste(ex_j, "~", ex_i)
                }
                fm <- do.call("lm", c(list(formula = fm, data = mtrc[ , c(ex_i, ex_j)]), lm.opt))
                lm.beta(fm)
            }
        })
        names(yy) <- colnames(mtrc)
        yy
    }))
    rownames(xx) <- colnames(mtrc)
    xx
}

You can pass the arguments intended for each function via cor.opt, cramersV.opt and lm.opt, each given as a named list. Then, inside your function correlation, use do.call() for all the relevant function calls.
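To see the pattern in isolation, here is a minimal sketch that uses only base R and the built-in `mtcars` data. The function name `demo_fit` and the choice of columns are assumptions made for illustration; they are not part of the original `correlation` function.

```r
# Hypothetical helper: forwards one option list per target function.
demo_fit <- function(df, cor.opt = list(), lm.opt = list()) {
    # cor() only sees the options placed in cor.opt
    r <- do.call("cor", c(list(x = df$mpg, y = df$wt), cor.opt))
    # lm() only sees the options placed in lm.opt
    fit <- do.call("lm", c(list(formula = mpg ~ wt, data = df), lm.opt))
    list(r = r, n_used = nobs(fit))
}

res <- demo_fit(mtcars,
                cor.opt = list(method = "spearman"),
                lm.opt  = list(subset = mtcars$cyl == 4))
```

Because each list goes to exactly one function, an option such as `method`, which both cor and lm accept, cannot end up in the wrong call.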


Comment

I like @Roland's idea. He chooses to keep ..., splitting list(...) according to the formal arguments of the different functions. I, on the other hand, have asked you to manually put those arguments into separate lists. In the end, both of us ask you to use do.call() for the function calls.

Roland's idea is broadly applicable, as it is easier to extend to more functions that need arguments from `...`.
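A minimal sketch of Roland's suggestion, again with base R and `mtcars` only. The names `filter_dots` and `demo` are hypothetical, and this is one possible reading of his comment, not his code.

```r
# Hypothetical helper: keep only the elements of `dots` whose names
# are formal arguments of `fun`.
filter_dots <- function(fun, dots) {
    dots[names(dots) %in% methods::formalArgs(fun)]
}

demo <- function(df, ...) {
    dots <- list(...)
    # each call receives only the arguments it declares
    r <- do.call(cor, c(list(x = df$mpg, y = df$wt), filter_dots(cor, dots)))
    fit <- do.call(lm, c(list(formula = mpg ~ wt, data = df), filter_dots(lm, dots)))
    list(r = r, n_used = nobs(fit))
}

# `use` is a formal of cor only, `subset` a formal of lm only
res <- demo(mtcars, use = "complete.obs", subset = mtcars$cyl == 4)
```

One caveat with this sketch: a name that is a formal argument of several target functions is forwarded to all of them. For example, `method` is a formal of both cor and lm, so `method = "spearman"` would also reach lm.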

Zheyuan Li