How to make calculations across two different lists of dataframes?

Question

I have two lists of data frames, such that data is a list of 47 data frames, where each data frame has columns [coords, x, y, liklihood, x.1, x.2, liklihood.1, etc.] and dataA is a list of 47 data frames each of the same length as those in data, but with fewer columns [coords, x, y] that represent different coordinates.

I want to create a third list, or add a column to each data frame in one of the lists, that will contain the distance calculation from pointDistance(p1, p2) where p1 is the x and y columns of each data frame in list data, and p2 is the x and y columns of each data frame in list dataA.

I am trying to keep the dataframes in lists rather than having 47*2 individual data frames in my global environment.

Minimal Reproducible Example:

coords <- rnorm(10)
x <- rnorm(10)
y <- rnorm(10)
liklihood <- rnorm(10)
x.1 <- rnorm(10)
y.1 <- rnorm(10)

day1 <-  data.frame(coords,x,y,liklihood,x.1,y.1)

coords <- rnorm(10)
x <- rnorm(10)
y <- rnorm(10)
liklihood <- rnorm(10)
x.1 <- rnorm(10)
y.1 <- rnorm(10)

day2 <-  data.frame(coords,x,y,liklihood,x.1,y.1)

data <- list(day1,day2)

coords <- rnorm(10)
x <- rnorm(10)
y <- rnorm(10)
liklihood <- rnorm(10)

day1 <-  data.frame(coords,x,y,liklihood)

coords <- rnorm(10)
x <- rnorm(10)
y <- rnorm(10)
liklihood <- rnorm(10)


day2 <-  data.frame(coords,x,y,liklihood)

dataA <- list(day1,day2)

Could you add a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610)? The MRE will make it easier for others to find and test an answer to your question. That way you can help others to help you! — dario, Mar 07 '20 at 20:13
What have you tried so far? What is your distance calculation function? — dario, Mar 07 '20 at 21:18
I can't figure out how to make calculations across the lists to even really try anything... I am using the pointDistance function: https://www.rdocumentation.org/packages/raster/versions/3.0-12/topics/pointDistance — K8Otter, Mar 07 '20 at 21:38
You could use loops for that. For each entry in data you calculate the distance to each entry in dataA. Loops within loops ;) If you are not sure how to do that I highly suggest checking some of the many R tutorials. — dario, Mar 07 '20 at 21:57
Maybe I misread your question and you want something else. I'm not sure I understand what you meant by: *pointDistance(p1, p2) where p1 is the x and y columns of* **each** *data frame in list data, and p2 is the x and y columns of* **each** *data frame in list dataA.* — dario, Mar 07 '20 at 23:55
You got an answer from Allan Cameron. If it answered your question, you should accept it (by clicking on the check mark) and could consider up-voting it. If it did not, I'd suggest updating the question to clarify what you want and adding an example of what the result should look like. That way you can help others to help you ;)! — dario, Mar 08 '20 at 00:22

score 2 · Accepted Answer · answered Mar 07 '20 at 23:32

You can use mapply in base R to do this.

First, write a function that returns a single correct data frame when given a pair of data frames from your two lists, such as data[[1]] and dataA[[1]]:

library(raster)

append_distances <- function(df1, df2)
{
  df1$distance <- pointDistance(cbind(df1$x, df1$y), cbind(df2$x, df2$y), lonlat = FALSE)
  return(df1)
}

Now we just pass this function and your two lists to mapply:

data <-  mapply(append_distances, data, dataA, SIMPLIFY = FALSE)
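
For planar coordinates, the same pairing can be sketched in base R alone. This is only an illustration under stated assumptions: the toy lists data_demo and dataA_demo and the helper append_distances_base are hypothetical names, not part of the question, and the plain Euclidean formula stands in for pointDistance on the assumption that the coordinates are planar (as lonlat = FALSE implies). Map() is base R's wrapper around mapply that always returns a list.

```r
# Toy stand-ins for the two lists (hypothetical values, not the question's data)
data_demo  <- list(data.frame(x = c(0, 1), y = c(0, 1)),
                   data.frame(x = c(2, 3), y = c(2, 3)))
dataA_demo <- list(data.frame(x = c(3, 1), y = c(4, 1)),
                   data.frame(x = c(2, 0), y = c(2, -1)))

# Row-wise Euclidean distance between the x/y columns of two data frames
# (assumes planar coordinates and equal row counts)
append_distances_base <- function(df1, df2) {
  stopifnot(nrow(df1) == nrow(df2))
  df1$distance <- sqrt((df1$x - df2$x)^2 + (df1$y - df2$y)^2)
  df1
}

# Map() pairs the i-th element of each list and returns a list
stopifnot(length(data_demo) == length(dataA_demo))
result_demo <- Map(append_distances_base, data_demo, dataA_demo)
result_demo[[1]]$distance
#> [1] 5 0
```

The stopifnot() checks are a cheap guard: mapply and Map recycle shorter arguments, so lists of unequal length would otherwise be paired up without an obvious error.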

Each data frame in data now has a distance column added:

data
#> [[1]]
#>        coords          x           y  liklihood         x.1        y.1  distance
#> 1   0.4761741  0.7913819  0.11597299 -0.6159504 -0.17626836 -0.8649915 2.1378779
#> 2   0.2608518  0.4389639 -1.44510285 -0.5452702 -2.31927588 -0.5114613 3.0321765
#> 3   2.1098629  0.3457442  1.59630572 -0.3205454  0.25760236  1.6791924 0.4150714
#> 4   0.5937334 -0.2043505  0.23667944 -0.2480409 -0.52856599 -0.4263619 1.6662791
#> 5   0.2819461 -1.9768319  0.68344331 -0.4975349 -0.08315893  0.9271072 2.3841079
#> 6   0.5779044 -0.5706433  0.89377684 -1.0084165 -0.83697268  0.9928353 0.6818632
#> 7   0.1410554 -0.6133513  0.25957971 -0.1781339 -0.77489990 -0.7191718 0.8303696
#> 8  -1.1769578  0.9203776 -0.06258728 -0.8991639 -0.38907408 -0.8388408 0.5028145
#> 9  -0.1388739 -0.8279408  1.15568431 -0.3312423  1.17269754 -1.4530041 1.6042288
#> 10 -0.3755364  0.6285803  0.52453490  0.7323463 -0.49051839 -0.1949171 0.6205714
#> 
#> [[2]]
#>        coords           x          y  liklihood        x.1          y.1  distance
#> 1   2.2158425  0.16430566 -0.5721804 -0.7523029  0.2866881 -2.027529031 0.4418775
#> 2   1.5753250 -0.67190607 -0.1140359 -0.3125333 -0.5361148  0.153228235 1.7182954
#> 3   0.8558108  1.19404509 -1.5834463  0.3858246  0.4475970  0.460910344 1.6229581
#> 4   0.8027824  0.76579023 -0.5938679  0.5592208  0.5883806  0.231569460 3.3608275
#> 5  -1.1487244  0.01013471  0.6855049  0.7148735 -2.2822053  1.918921619 2.3790501
#> 6   0.1014336  0.73941541 -0.4487482  0.1758588  0.8579709  0.029777437 1.8923570
#> 7  -0.8238857  0.67911991 -0.9140873 -0.6887611 -1.0709704 -0.009789701 1.4694983
#> 8  -0.1553338  0.78560221 -0.8218460 -0.5537232  0.7295692  0.744225760 2.4279377
#> 9  -0.6297834  0.09747354  0.2048211 -1.0849396 -0.2201589  0.173386536 0.8638957
#> 10 -0.4616377 -0.51116686  0.3204535 -0.5285903  1.0053890 -0.534173400 1.0715881
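
The question also mentions building a third list instead of adding a column. A minimal sketch of that variant follows, again using hypothetical toy lists (data_demo, dataA_demo) and a plain Euclidean formula in place of pointDistance, which assumes planar coordinates. The element names day1 and day2 are added here only to label the output.

```r
# Hypothetical toy lists; names are only there to label the result
data_demo  <- list(day1 = data.frame(x = c(0, 6), y = c(0, 8)),
                   day2 = data.frame(x = c(1, 1), y = c(1, 5)))
dataA_demo <- list(day1 = data.frame(x = c(3, 0), y = c(4, 0)),
                   day2 = data.frame(x = c(1, 4), y = c(1, 1)))

# One numeric vector of row-wise distances per pair of data frames;
# the input lists are left unchanged
distances_demo <- Map(
  function(df1, df2) sqrt((df1$x - df2$x)^2 + (df1$y - df2$y)^2),
  data_demo, dataA_demo
)

distances_demo$day1
#> [1]  5 10
distances_demo$day2
#> [1] 0 5
```

Keeping the distances in their own list leaves both inputs untouched, at the cost of having to keep three lists aligned by position or name.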
