
I'm trying to perform geographically weighted regression (GWR) by following this example with my data. My data are raster files, which I converted to a data frame (df), so now I work with the latter (the example I follow uses the meuse data set). When I try to plot the residuals using this command

map.resids <- SpatialPointsDataFrame(data=data.frame(resids),coords=cbind(x,y))

I am getting this error

Error in dimnames(coords@coords) <- `*vtmp*` : length of 'dimnames' [1] not equal to array extent
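
For context, the same call runs fine with the meuse data from the example I followed, and the error above seems to be what sp throws when the coordinate matrix does not have one row per row of data. A minimal sketch with sp's built-in meuse data:

library(sp)
data(meuse)  # 155 observations with x and y columns

# coords needs exactly one row per row of data, otherwise
# SpatialPointsDataFrame() fails with the dimnames error shown above
pts <- SpatialPointsDataFrame(coords = meuse[, c("x", "y")],
                              data   = meuse["zinc"])
nrow(pts)  # 155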

Any idea on how to add coordinates to the residuals? Below is my code.

library(sp)

ntl = read.csv(file = "path/ntl_df.csv")
pop = read.csv(file = "path/pop_df.csv")
evi = read.csv(file = "path/evi_df.csv")
ntl_coords = read.csv(file = "path/ntl_df_coords.csv")

all = cbind(ntl, nir,  pop, evi, tir)

#global linear model
model1 <- lm(ntl_clip ~ pop_clip + evi_clip, data = all)
summary(model1)
plot(model1, which=3)

resids<-residuals(model1)
colours <- c("dark blue", "blue", "red", "dark red") 
map.resids <- SpatialPointsDataFrame(data=data.frame(resids), coords=ntl_coords)

My data can be found here.

####### EDIT #######

I found a workaround to extract the coordinates from my original raster by using

ntl_coords = xyFromCell(ntl_r, 1:ncell(ntl_r))
ntl_coords = as.data.frame(ntl_coords)

but the resulting df has 14145 rows, while my raster, ntl_clip, (after converting to a df) has 4243 rows.
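
I suspect the difference comes from NA cells: xyFromCell() returns coordinates for every cell of the raster, while converting the raster to a df drops the NA cells. A minimal sketch of that check (assuming ntl_r is the clipped raster loaded with the raster package):

library(raster)

ncell(ntl_r)                                         # 14145: every cell, NA or not
nrow(as.data.frame(ntl_r, xy = TRUE, na.rm = TRUE))  # 4243: non-NA cells only

# keeping x/y alongside the values from the start avoids the mismatch
ntl_df <- as.data.frame(ntl_r, xy = TRUE, na.rm = TRUE)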

Also, here is an image that shows my dependent variable (ntl_clip), the study area (red polygon) and the residuals (orange dots). As can be seen, the residuals extend outside of the study area and do not cover the whole area.

Dolby
  • Rather than having us download your data from a third-party site, is it possible to test this out using, for example, the meuse data, since it's available in common spatial analysis packages? That might also help simplify the question and improve debugging – camille Jan 16 '22 at 00:11
  • Thanks camille, I tested the meuse data and the code runs fine. The problem is with my data; that's why I uploaded them. – Dolby Jan 16 '22 at 15:27
  • Okay, that makes sense. Debugging step two would be: does the problem occur with every file, or just some of them? I'm just wondering if you can pinpoint the issue. You're also missing whatever nir and tir are; are those essential to building an example version of the model? – camille Jan 16 '22 at 19:47
  • I haven't included the NIR and TIR spectral channels in my lm model, for processing-speed reasons. The problem occurs with every covariate. I think I have pinpointed the problem: as I mention in the post, when I convert my raster (ntl_clip) to a df it has 4243 rows, while converting the coordinates to a df gives a table with 14145 rows. Could it be that the difference in rows is due to the NaN values? – Dolby Jan 16 '22 at 20:36

1 Answer


I found the solution based on this article, specifically the answer by Simon O'Hanlon. Before the ntl_coords <- xyFromCell(ntl_r, 1:ncell(ntl_r)) step, one simply has to run vals <- values(ntl_r). So the code is:

library(raster)

ntl_r <- raster("path/ntl_clip.tif")
vals <- values(ntl_r)                            # cell values, one per cell (NA cells included)
ntl_coords <- xyFromCell(ntl_r, 1:ncell(ntl_r))  # x/y coordinates of every cell
combine <- cbind(ntl_coords, vals)

Of course after the cbind I deleted all the NaN values manually, outside of R.
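
For completeness, the same cleanup can be done inside R with complete.cases(), and the cleaned table can then supply coordinates for the residuals. A minimal sketch, assuming the model was fitted on exactly the non-NA pixels so that residuals(model1) lines up row-for-row with the cleaned table:

# drop the rows whose raster value is NA, instead of editing the table externally
combine <- as.data.frame(combine)
combine <- combine[complete.cases(combine), ]

# the remaining x/y columns match the non-NA pixels, so they can serve
# as coordinates for the residuals of the global model
library(sp)
map.resids <- SpatialPointsDataFrame(coords = combine[, c("x", "y")],
                                     data   = data.frame(resids = residuals(model1)))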

Dolby