Applying an existing multi-argument function to multiple dataframes, row by row, with a joint output dataframe

Question

I have a function taking four arguments,

h(a, b, c, d)

where a and b are the i-th and (i+1)-th rows of df1, and c and d are the i-th and (i+1)-th rows of df2. The output has four variables and one result per pair of consecutive rows (n-1 results if the data frames have n rows).

The idea is the following: I want to apply the function h to each set of these four arguments that shares the same i, so:

- for the first iteration it will take the 1st and 2nd rows of df1 and the 1st and 2nd rows of df2
- for the second iteration it will take the 2nd and 3rd rows of df1 and the 2nd and 3rd rows of df2
- ...

Ideally, the results would then be stored in a separate data frame with 4 columns and one row per iteration (n-1 rows if the data frames have n rows).

I tried using the apply function and a for loop, but neither attempt worked. I don't necessarily need a ready-made solution; a hint would be nice. Thanks!

EDIT: reproducible example:

df1 <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

df2 <- data.frame(c = c(4, 3, 2, 1), d = c(8, 7, 6, 5))

h <- function (a, b, c, d) {
  vector <- (a + b) / (c - d)

  vector
}

I would like a function that keeps calling h until b and d reach the last row of df1/df2 (they have the same number of rows) and that, for each such combination, generates vector and appends it to a new data frame as the next row.

It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. — MrFlick, Nov 25 '19 at 20:23
@GabrielSilva How can I include a reproducible example of a function that I have no good idea of how to write? Isn't what I wrote the only thing that you need to see what I want to automatize? — Bartek Arendarski, Nov 26 '19 at 08:39
@BartekArendarski If you run the code you wrote, you will get several errors. For instance, `(c(1, 2, 3, 4), ..., c(13, 14, 15, 16))` means nothing in R and `h` is not how you define a function. Plus, you should try to make your example *minimal*. For example, you do not use columns 3 and 4, so do not include them. — Gabriel M. Silva, Nov 26 '19 at 12:32

score 0 · Answer 1 · answered Nov 26 '19 at 12:52

With apply you could do something like this:

df1 <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))

df2 <- data.frame(c = c(4, 3, 2, 1), d = c(8, 7, 6, 5))

h <- function (a, b, c, d) {
  (a + b) / (c - d)
}

apply(cbind(df1, df2), 1, function(x) h(x["a"], x["b"], x["c"], x["d"]))
#> [1] -1.5 -2.0 -2.5 -3.0

If h is a vectorized function (as in your example), it would be better to call it once on whole columns with do.call:

do.call(h, cbind(df1, df2))

Of course, I am assuming that your real h is not this simple; if it were, (df1$a + df1$b) / (df2$c - df2$d) would suffice.
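As a rough check of why the do.call version works: a data frame is a list of named columns, so do.call passes each column to the argument of h with the same name. The sketch below assumes the df1, df2 and h from the example above; the names args, via_do_call and direct are only illustrative.

```r
df1 <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
df2 <- data.frame(c = c(4, 3, 2, 1), d = c(8, 7, 6, 5))

h <- function(a, b, c, d) {
  (a + b) / (c - d)
}

# One data frame whose column names (a, b, c, d) match the arguments of h
args <- cbind(df1, df2)

# Equivalent to h(a = args$a, b = args$b, c = args$c, d = args$d)
via_do_call <- do.call(h, args)

# The same computation written out by hand on whole columns
direct <- (df1$a + df1$b) / (df2$c - df2$d)

identical(via_do_call, direct)
```

This only works when the column names match the argument names of h exactly and when h accepts whole vectors.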

However, I advise learning about the purrr package. It is well suited to this kind of situation, mainly because you can declare what type of output you expect (with the purrr::map_* functions), which ensures consistency and avoids unexpected results.
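To illustrate the point about declaring the output type, here is a small sketch. The helper halve is hypothetical and exists only for this example; map returns a list, while map_dbl returns a double vector and raises an error if any result is not a single number.

```r
library(purrr)

# Hypothetical helper, used only for illustration
halve <- function(x) x / 2

as_list <- map(c(1, 2, 3), halve)      # always a list
as_dbl  <- map_dbl(c(1, 2, 3), halve)  # a double vector

# map_dbl() raises an error when a result is not a single number
failed <- tryCatch(
  {
    map_dbl(c(1, 2, 3), function(x) "not a number")
    FALSE
  },
  error = function(e) TRUE
)
```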

To pass several columns of a data frame as arguments, use the purrr::pmap_* functions:

# `pmap` returns a list
purrr::pmap(cbind(df1, df2), h)
#> [[1]]
#> [1] -1.5
#>
#> [[2]]
#> [1] -2
#>
#> [[3]]
#> [1] -2.5
#>
#> [[4]]
#> [1] -3

# `pmap_dbl` returns a double vector or throws an error otherwise
purrr::pmap_dbl(cbind(df1, df2), h)
#> [1] -1.5 -2.0 -2.5 -3.0
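
The calls above apply h within a single row. For the pairing the question describes (rows i and i+1 of each data frame), one possible base R sketch follows. It assumes that h should receive each row as a numeric vector and that the results should be stacked into a new data frame; the names n, rows and result are illustrative only.

```r
df1 <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8))
df2 <- data.frame(c = c(4, 3, 2, 1), d = c(8, 7, 6, 5))

h <- function(a, b, c, d) {
  (a + b) / (c - d)
}

n <- nrow(df1)  # assumed equal to nrow(df2)

# One call to h per pair of consecutive rows: i = 1, ..., n - 1
rows <- lapply(seq_len(n - 1), function(i) {
  h(
    a = unlist(df1[i, ]),      # i-th row of df1 as a numeric vector
    b = unlist(df1[i + 1, ]),  # (i+1)-th row of df1
    c = unlist(df2[i, ]),      # i-th row of df2
    d = unlist(df2[i + 1, ])   # (i+1)-th row of df2
  )
})

# Stack the per-iteration vectors into a data frame, one row per iteration
result <- as.data.frame(do.call(rbind, rows))
```

With this example h, result has n-1 = 3 rows and two columns, because each row passed in has two values. A real h that returns four values per call would give four columns.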
