1

My question is very similar to this one, but the problem I am facing has a twist that those answers do not address. Specifically, I am estimating a spatial model, y=rho * lw * y + X *beta. Because the observations are related by the matrix lw, I must apply the model to the entire X matrix simultaneously. Because those answers operate row-wise, they do not apply.

Here is MWE data, consisting of twenty points across three groups and a spatial weights matrix:

library(spdep)
#Coordinates
pointcoords <- data.frame(x = runif(n=20, min =10, max = 100), y = runif(n=20, min = 10, max = 100), ID = as.character(1:20))
pointsSP <- SpatialPoints(pointcoords[,1:2])
# Weights matrix
lw <- nb2listw(knn2nb(knearneigh(pointsSP, k = 4, RANN = FALSE), 
                      row.names = pointcoords$ID))

# Data
MyData <- data.frame(ID = rep(1:20, each = 3),
                     Group = rep(1:3, times = 20),
                     DV = rnorm(60),IV = rnorm(60))

I can estimate the models by Group with dplyr

library(dplyr)
models <- MyData %>% group_by(Group) %>% 
  do(lm = lm(DV ~ IV, data = .), 
     sar = lagsarlm(DV ~ IV, data = ., listw = lw))

Predicting to new data with this answer operates on a row-wise basis, working fine for the lm objects,

MyData2 <- data.frame(ID = rep(1:20, each = 3),
                      Group = rep(1:3, times = 20),
                      IV = rnorm(60))

MyData2 %>% left_join(models) %>% rowwise %>%
  mutate(lmPred = predict(lm, newdata = list("IV" = IV))) %>% head()
#Joining by: "Group"
#Source: local data frame [6 x 6]
#Groups: 

#  ID Group         IV      lm        sar      lmPred
#1  1     1 -0.8930794 <S3:lm> <S3:sarlm> -0.21378814
#2  1     2 -1.6637963 <S3:lm> <S3:sarlm>  0.42547796
#3  1     3  0.5243841 <S3:lm> <S3:sarlm> -0.23372996
#4  2     1 -0.1956969 <S3:lm> <S3:sarlm> -0.20860280
#5  2     2  0.8149920 <S3:lm> <S3:sarlm>  0.14771431
#6  2     3 -0.3000439 <S3:lm> <S3:sarlm>  0.05082524

But not for the sar models:

MyData2 %>% left_join(models) %>% rowwise %>%
  mutate(sarPred = predict(sar, newdata = list("IV" = IV), listw=lw)) %>% head()
#Joining by: "Group"
#Error in if (nrow(newdata) != length(listw$neighbours)) stop("mismatch between newdata and spatial weights") : 
  argument is of length zero

I think there should be a better way of doing this, without joining the model to every row. Also, creating a list object for newdata won't work if you have several or changing predictor variables. It seems that the dplyr way should be something like this:

MyData2 %>% group_by(Group) %>%
  mutate(sarPred = predict(models$sar[[Group]],  newdata = ., listw=lw))

But the [[Group]] index isn't quite right.

Community
  • 1
  • 1
gregmacfarlane
  • 2,121
  • 3
  • 24
  • 53

2 Answers2

1

I'm putting this out there because it does do what I want it to, even if it needs to use a for loop (gasp)

predictobj <- list()
for(i in models$Group){
  predictobj[[i]] <- predict.sarlm(models$sar[[i]], 
                                   newdata = filter(MyData2, Group == i),
                                   listw = lw)
}

Anybody have a dplyr solution?

gregmacfarlane
  • 2,121
  • 3
  • 24
  • 53
1

I ended up doing this with do in dplyr, going through the models data.frame rowwise. I believe it does what you want, although the output doesn't contain the new data used for predictions. I did add in Group to the output, though, as it seemed necessary to keep groups separated.

models %>%
    do(data.frame(Group = .$Group, 
                predlm = predict(.$lm, newdata = filter(MyData2, Group == .$Group)), 
                predsar = predict(.$sar, newdata = filter(MyData2, Group == .$Group) , listw = lw)))

EDIT

Playing around with adding the explanatory variable into the output data.frame. The following works, although there is likely a better way to do this.

models %>%
    do(data.frame(Group = .$Group, IV = select(filter(MyData2, Group == .$Group), IV),
                predlm = predict(.$lm, newdata = filter(MyData2, Group == .$Group)), 
                predsar = predict(.$sar, newdata = filter(MyData2, Group == .$Group) , listw = lw)))
aosmith
  • 34,856
  • 9
  • 84
  • 118
  • This is good, though I think it doesn't go through `models` *rowwise*, because `models` is a list. There will certainly be some post processing to get the data appended properly, but this works for now. I think this is a better answer for the other question, as well. – gregmacfarlane Jul 22 '14 at 16:35
  • I don't have all the language from `dplyr` down pat, but the `models` object I'm working on appears to be a rowwise data.frame (`rowwise_df`). I hadn't run into this term before, but I was originally attempting to group `models` and was warned `Grouping rowwise data frame strips rowwise nature`. – aosmith Jul 22 '14 at 16:53