I want to use a user-defined function with the outer function. I want to use every combination of the two vectors' elements as arguments for my function myfun.

v1 = c(30, 60, 100)
v2 = c(30, 60, 100)

myfun = function(x, y){
  rt = unlist(Vectorize(retimes::rexgauss)(n = x, tau = y, mu = 500, sigma = 50))
  ks = ks.test(rt, "pnorm", mean(rt), sd(rt))$p.value
  shap = shapiro.test(rt)$p.value
  z = skew(rt) / sqrt(6/length(rt))
  ztest = pnorm(-abs(z))*2
  results = c(ks, shap, ztest)
  names(results) = c("ks", "shapiro", "ztest")
  return(results)
}

outer(v1, v2, myfun)

If I do it like this, I get this error:

Error in dim(robj) <- c(dX, dY) : dims [product 9] do not match the length of object [3]

I want to avoid looping over all the elements of my two vectors. How can I use the outer function here? How do I vectorize my user-defined function properly?

Max J.
  • Add desired result. Read the following https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Hector Haffenden Oct 21 '18 at 23:17
  • ["dims \[product xx\] do not match the length of object \[xx\]" error in using R function `outer`](https://stackoverflow.com/q/52317124/4891738); [Why doesn't outer work the way I think it should (in R)?](https://stackoverflow.com/q/18110397/4891738) – Zheyuan Li Oct 21 '18 at 23:45
  • Max, the purpose of `outer` is to produce the outer-product of two vectors. The [definition of the outer product](https://en.wikipedia.org/wiki/Outer_product) says that two vectors of length `n` and `m` will produce a matrix of dimensions `n x m`. Each of your `x` and `y` will be a vector of length `n*m` (9 here), and your function *must* return a vector of the same length. – r2evans Oct 21 '18 at 23:48
  • @r2evans @李哲源 I understand that `outer` might be the wrong function for my task. Is there an alternative function which could help here? I tried using `sapply(v2, function(y) sapply(v1, function(x) myfun(x, y)))` but it's not too elegant and in its output it is hard to see which row corresponds to which combination of the vectors' elements. – Max J. Oct 22 '18 at 08:01
  • What is your expected output? Just a vector of three numbers? Are you expecting your function to be called once for each combination, or once with all combinations? – r2evans Oct 22 '18 at 13:35

1 Answer

To use outer, a few basic requirements must be met:

  1. the function is called once with two vectors, each already expanded to length length(X)*length(Y) (as shown below); whether it does vectorized work on them or works on the elements individually is up to you;

  2. it must return a vector of the same length as those expanded x (and y) vectors; and

  3. you must expect the output as a matrix of dimensions length(X) by length(Y), where X and Y are the vectors originally passed to outer.
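As a minimal sketch of these requirements (add and total are hypothetical toy functions, not part of the question), the following contrasts a function that is already vectorized with one that collapses its input to a single value, and shows that wrapping the latter in Vectorize makes it acceptable to outer:

```r
# Already vectorized: returns one value per element of the expanded inputs.
add <- function(x, y) x + y
ok <- outer(1:3, 1:2, add)      # 3 x 2 matrix

# Hypothetical non-vectorized function: collapses everything to one number,
# so outer cannot reshape its result into a 3 x 2 matrix.
total <- function(x, y) sum(x) + sum(y)
bad <- try(outer(1:3, 1:2, total), silent = TRUE)
inherits(bad, "try-error")      # the "dims [product 6] do not match" error

# Vectorize() calls total once per pair and returns one value per pair.
fixed <- outer(1:3, 1:2, Vectorize(total))
```

This only helps when the function returns a single value per pair. A function such as myfun, which returns three values per pair, still cannot satisfy requirement 2, even when wrapped in Vectorize.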

Since these do not all hold for your function, we move on. "The right function" depends on how you want your function to be run. A companion function to outer is expand.grid (and tidyr::crossing, the tidyverse version), in that it creates the same combinations of the supplied vectors. For instance, this is what outer passes to its function:

outer(c(30,60,90), c(30, 60, 100), function(x,y) {browser();1;})
# Called from: FUN(X, Y, ...)
# Browse[2]> 
x
# [1] 30 60 90 30 60 90 30 60 90
# Browse[2]> 
y
# [1]  30  30  30  60  60  60 100 100 100

and

eg <- expand.grid(x=c(30,60,90), y=c(30, 60, 100))
eg
#    x   y
# 1 30  30
# 2 60  30
# 3 90  30
# 4 30  60
# 5 60  60
# 6 90  60
# 7 30 100
# 8 60 100
# 9 90 100

(which you can then access as eg$x and eg$y).
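Because expand.grid lists the pairs in the same order that outer uses to expand its inputs, per-pair results computed from eg can be reshaped into the matrix that outer would have produced. A small sketch, using addition as a stand-in for a real function and X, Y, vals and m as illustrative names:

```r
X <- c(30, 60, 90)
Y <- c(30, 60, 100)
eg <- expand.grid(x = X, y = Y)

# One call per row of eg; addition is only a placeholder for real work.
vals <- mapply(function(x, y) x + y, eg$x, eg$y)

# Fill column-wise into a length(X) by length(Y) matrix and label it
# so each cell shows which x (row) and y (column) produced it.
m <- matrix(vals, nrow = length(X), dimnames = list(x = X, y = Y))
m
```

The dimnames are optional, but they make it easy to read off which combination each cell belongs to.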

Some options:

  1. if you want your function to be called once (as outer does) with the two expanded vectors as its arguments, leaving it to figure out what to do with them:

    eg <- expand.grid(x=c(30,60,90), y=c(30, 60, 100))
    do.call("myfunc", eg)
    

    Note that expand.grid, if given character arguments, will create factors by default (as data.frame did before R 4.0.0). It accepts stringsAsFactors=FALSE to keep them as character vectors.

  2. if you want your function to be called once for each pair of elements (so 9 times in this example), then use

    mapply(myfunc, eg$x, eg$y)
    

    if the number of vectors is known. If it is not, then, using eg from above,

    do.call("mapply", c(myfunc, eg))
    

    should work. Depending on what your function returns, you can stop mapply from "simplifying" the result (i.e., force a list output) with

    do.call("mapply", c(myfunc, eg, SIMPLIFY=FALSE))
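When the function returns several values per pair, the list output can be bound back onto eg so that every row shows which combination produced it. The sketch below assumes a hypothetical toyfun as a deterministic stand-in for the question's myfun (which needs the retimes package and random draws); res and out are illustrative names:

```r
# Hypothetical stand-in for myfun: three named values per (x, y) pair.
toyfun <- function(x, y) c(sum = x + y, diff = x - y, prod = x * y)

eg <- expand.grid(x = c(30, 60, 100), y = c(30, 60, 100))

# One call per row of eg; SIMPLIFY = FALSE keeps a list of named vectors.
res <- do.call("mapply", c(toyfun, eg, SIMPLIFY = FALSE))

# Stack the per-pair vectors as rows and attach them to their inputs.
out <- cbind(eg, do.call(rbind, res))
out
```

Each row of out then holds x, y and the three results for that pair, which may be easier to read than the matrix returned by nested sapply calls.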
    
r2evans
  • I had been experimenting with `expand.grid` but was missing the `do.call` part. Your second approach worked for me! Thank you. – Max J. Oct 23 '18 at 15:10
  • It can seem odd to call `do.call("mapply", ...)`, but it's a good way (imo) to deal with variable-length lists, as `eg` can be if the number of columns is variable. – r2evans Oct 23 '18 at 15:27