12

I have three data frames as below. I wish to combine them into one data frame according to Lon & Lat, and average the 3 values for each 'cell'. I have read this (calculate average over multiple data frames) and attempted to utilise aggregate but to no avail....any pointers appreciated.

> head(CSR.GRACE[,c(1:14)],10)
    Lon  Lat   January  February     March     April       May     June        July     August  September   October    November  December
1  28.5 -4.5 17.710425 13.855327 12.385712 13.558101 12.789865 6.913783  1.03770075 -5.3901741 -6.6351015 -7.661375 -3.09337944 6.0659410
2  29.5 -4.5 14.010154 10.257435  9.009641 10.275778  9.598241 5.166972  0.73570247 -4.2733162 -5.0861417 -5.850192 -2.93521806 4.1240150
3  30.5 -4.5 16.288443 10.467614  9.275714 10.904162 10.228808 5.364853  0.50089883 -4.7478741 -5.4320069 -6.316568 -3.80160315 3.8494745
4  31.5 -4.5 18.560677  9.932461  9.239592 11.037748 10.551886 5.281853  0.01181973 -4.9034324 -5.3504391 -6.438050 -4.41695714 3.3432301
5  32.5 -4.5 10.171202  4.476512  4.509140  5.448872  5.338991 2.556262 -0.22646611 -2.3274204 -2.4376636 -3.103697 -2.27586145 1.3641930
6  33.5 -4.5 14.040068  5.349344  5.772618  7.158792  7.121341 3.407587 -0.30616689 -2.6800099 -2.7955420 -3.803622 -2.77898997 1.4021380

> head(GFZ.GRACE[,c(1:14)],10)
    Lon  Lat   January  February     March     April       May      June     July     August September   October   November  December
1  28.5 -4.5 15.642782 15.521720 11.823875 19.825865 17.335761 11.208188 5.080615 -3.0897644 -5.733351 -4.196604 -1.6697661 10.744696
2  29.5 -4.5 12.164074 10.931418  8.622238 15.341911 12.969769  8.521280 4.072790 -2.4301791 -4.551170 -3.055914 -1.2260079  7.592880
3  30.5 -4.5 13.579305 10.267520  8.787406 16.567715 13.745143  9.121496 4.497849 -2.6723491 -5.022949 -3.269881 -1.0691039  7.377143
4  31.5 -4.5 14.501465  8.600480  8.259757 16.981533 14.054429  9.318550 4.582672 -2.7917893 -5.249895 -3.636936 -0.5141342  6.770836
5  32.5 -4.5  7.311216  3.249596  3.513870  8.430777  6.941659  4.572560 2.203461 -1.4106516 -2.661226 -2.113089  0.2459282  3.049897
6  33.5 -4.5  9.121348  3.113245  3.584976 11.040761  8.732950  5.772059 2.811168 -1.8554437 -3.524447 -3.272863  1.2493973  3.750694

> head(JPL.GRACE[,c(1:14)],10)
    Lon  Lat   January  February     March     April       May     June     July     August  September    October   November   December
1  28.5 -4.5 19.559790 14.544438 12.035112 13.944141 11.931011 7.513007 3.095003 -3.6165702 -6.5945043 -7.2498567 -4.5402436  6.3935236
2  29.5 -4.5 15.740160 11.192191  8.549782 10.783359  9.401173 5.834498 2.267822 -2.6354346 -4.8939197 -5.5912996 -3.7295148  4.1461123
3  30.5 -4.5 18.984714 12.014807  8.510139 11.628697 10.635699 6.448064 2.260429 -2.6979695 -5.2102337 -6.2646164 -4.2713238  3.5089825
4  31.5 -4.5 22.794356 11.993054  8.162500 11.813746 11.747350 6.955983 2.164615 -2.5707902 -5.3448873 -6.7473006 -4.5777496  2.5609555
5  32.5 -4.5 13.233634  5.606305  3.880347  5.753024  6.388978 3.742596 1.096214 -1.1103189 -2.6367831 -3.4102675 -2.2860237  0.7826054
6  33.5 -4.5 19.260989  6.761722  4.978247  7.373498  9.135645 5.421030 1.706414 -1.0796434 -3.3122886 -4.2114588 -2.8110246  0.4825075
Community
  • 1
  • 1
Darren J
  • 503
  • 2
  • 5
  • 16

5 Answers5

16

You can do:

library(data.table)

rbindlist(list(JPL.GRACE,GFZ.GRACE,CSR.GRACE))[,lapply(.SD,mean), list(Lon, Lat)]

Explanations:

Your data.frames are put into a list and 'superposed horizontaly' using rbindlist (which returns a data.table). We do this since your data.frame has the same structure (same number and name of columns, same type of data). An alternative approach would have been to do do.call(rbind, list(JPL.GRACE,GFZ.GRACE,CSR.GRACE)).

We then loop over each distinct pair of Lon, Lat. .SD represents the data.table associated with each pair. You can see it by doing:

dt = rbindlist(list(JPL.GRACE,GFZ.GRACE,CSR.GRACE))
dt[,print(.SD), list(Lon, Lat)]

For each of these .SD, we simply loop over the columns and compute the means.

Colonel Beauvel
  • 30,423
  • 11
  • 47
  • 87
6

This could be done very easily with an 3-wide array using 1:2 as the "MARGIN":

install.packages('abind')
library(abind)
temp_array <- abind(CSR.GRACE, GFZ.GRACE, JPL.GRACE, along=3)
res <- apply(temp_array, 1:2, mean)

Here's a simple example:

 x <- matrix(1:12,3,4)
 y <- x+100; z= y-50
 apply( abind::abind(x,y,z, along=3),  1:2, mean)
     [,1] [,2] [,3] [,4]
[1,]   51   54   57   60
[2,]   52   55   58   61
[3,]   53   56   59   62
IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • I am getting the following error NameError: name 'library' is not defined – Mahmud Dec 23 '21 at 14:06
  • The error text makes me think you wonder if you failed to quote it inside `install.packages(.)`. The `install.packages` function does need to have it's first argument quoted. If they are not quoted, the unquoted argument is used as an R name for a character data object. – IRTFM Dec 23 '21 at 18:16
  • Actually i was trying in Python. Then i realized the solution is given for R – Mahmud Dec 30 '21 at 05:07
4

Since your data is all numeric, you could put it in a 3d array and use rowMeans

library(abind)

arr = abind(JPL.GRACE, GFZ.GRACE, CSR.GRACE, along = 3)
rowMeans(arr, dims = 2)
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
2

Maybe the sqldf package can help you:

library(sqldf)
df1 <- CSR.GRACE[,c(1:14)]
df2 <- GFZ.GRACE[,c(1:14)]
df3 <- JPL.GRACE[,c(1:14)]
# This could be done with rbind(), but I'll use sqldf()
# I'm assuming all data frames have the same columns:
df <- sqldf('select * from df1 
             union all select * from df2 
             union all select * from df3')
# The average can be done also with sqldf (just a demo)
sqldf('select Lon, Lat, avg(January) as jan, avg(February) as feb
       from df
       group by Lon, Lat')

There may be better solutions, but this is a simple way.

Hope this helps

Barranka
  • 20,547
  • 13
  • 65
  • 83
0

Here's a base R solution:

r <- do.call(rbind, list(JPL.GRACE,GFZ.GRACE,CSR.GRACE))
r <- aggregate(r[-c(1,2)], list(r$Lon, r$Lat), mean)
BHD
  • 21
  • 4