
I originally asked this as part of the Efficient way to plot data on an irregular grid question, but the general feedback was to split the original question into more manageable chunks. Hence, this new question.

I work with satellite data organized on an irregular two-dimensional grid whose dimensions are scanline (along-track dimension, i.e. the Y axis) and ground pixel (across-track dimension, i.e. the X axis). The latitude and longitude of each pixel centre are stored in auxiliary coordinate variables, as are the coordinate pairs of the four pixel corners (all coordinates are given on the WGS84 reference ellipsoid).
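In the real products these arrays live in NetCDF files. For context, reading them with the ncdf4 package looks roughly like this (a sketch only: the file name and variable names below are placeholders, not the actual product's):

library(ncdf4)
nc <- nc_open("product.nc")           # hypothetical file name
lat <- ncvar_get(nc, "latitude")      # hypothetical variable names; 2D arrays
lon <- ncvar_get(nc, "longitude")     # with dimensions ground_pixel x scanline
temp <- ncvar_get(nc, "temperature")
nc_close(nc)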

Let's build a toy data set consisting of a 12x10 (potentially irregular) grid and associated surface temperature measurements.

library(pracma) # for the meshgrid function
library(ggplot2)

num_sl <- 12 # number of scanlines
num_gp <- 10 # number of ground pixels
l <- meshgrid(seq(from=-20, to=20, length.out = num_gp), 
              seq(from=30, to=60, length.out = num_sl))

# skew the regular mesh slightly so that the grid becomes irregular
lon <- l[[1]] + l[[2]]/10
lat <- l[[2]] + l[[1]]/10

# temperatures decreasing along track, with some added noise
data <- matrix(seq(from = 30, to = 0, length.out = num_sl*num_gp), 
               byrow = TRUE, nrow = num_sl, ncol = num_gp) +
  matrix(runif(num_gp*num_sl)*6, nrow = num_sl, ncol = num_gp)

df <- data.frame(lat=as.vector(lat), lon=as.vector(lon), temp=as.vector(data))

The lon and lat matrices contain the centre-pixel coordinates as provided in the original product I'm working with; their axes are ground_pixel (X axis) and scanline (Y axis). The data matrix, of the same dimensions, contains my measurements. I then flatten the three matrices and store them in a data frame.

I would like to plot the ground pixels (as quadrilaterals) on a map, filled according to the temperature measurements.

Using tiles I get:

ggplot(df, aes(y=lat, x=lon, fill=temp)) + 
  geom_tile(width=2, height=2) +
  geom_point(size=.1) +
  borders('world', colour='gray50', size=.2) + 
  coord_quickmap(xlim=range(lon), ylim=range(lat)) +
  scale_fill_distiller(palette='Spectral') +
  theme_minimal()

[Figure: the toy data plotted with geom_tile]

But that's not what I am after. I could play with width and height to make the tiles "touch" each other, but of course that wouldn't even come close to my desired goal, which is to plot the actual projected ground pixels on the map.
Python's xarray can, for example, automatically infer the pixel boundaries given the pixel centre coordinates:

[Figure: xarray plot with automatically inferred pixel boundaries]

Question

Is there a way to achieve the same result in R, that is: having the pixel boundaries automatically inferred from the pixel centres, and plotting the pixels as filled polygons on a map? Maybe using the sf package?

I can see something similar done in the answer to this question, but the sf-based answer is a bit unclear to me, as it deals with different projections and a potentially regular grid, whereas in my case I suppose I don't have to re-project my data and, furthermore, my data is not on a regular grid.

If this is not possible, I suppose I can resort to using the pixel-boundary information in my products, but maybe that's a topic for another question, should this one prove hard to tackle.
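For reference, should I end up using the corner coordinates, building the pixel polygons with sf might look roughly like this (a sketch only: the corner arrays lon_c and lat_c and their 4 x ground_pixel x scanline layout are assumptions, and the flattening order of data must match):

library(sf)
# hypothetical corner arrays lon_c, lat_c with dim c(4, num_gp, num_sl):
# four corner coordinates per pixel, assumed ordered around the pixel
lonc <- matrix(lon_c, nrow = 4)  # collapse to 4 x (num_gp*num_sl)
latc <- matrix(lat_c, nrow = 4)
polys <- lapply(seq_len(ncol(lonc)), function(i) {
  ring <- cbind(lonc[, i], latc[, i])
  st_polygon(list(rbind(ring, ring[1, ])))  # close the ring
})
pix <- st_sf(temp = as.vector(data), geometry = st_sfc(polys, crs = 4326))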

stm4tt
  • You say that you have the coordinates of the corners of each tile? I would suggest using `sf` to create the tiled grid and the development version of `ggplot` to plot with `geom_sf`. Provided the CRS is set properly when making these polygons, it should be possible to get the desired python plot. Exactly how to do that depends on how the coordinates and temp measurements are stored - current example data only has center pixels, right? – Calum You Mar 01 '18 at 15:02
  • Yes that's correct. I was hoping for an easy way to infer the pixel boundaries from the pixel centres. I've seen this: `polys = as(SpatialPixelsDataFrame(orig_grid, orig_grid@data, tolerance = 0.149842),"SpatialPolygonsDataFrame")` done in this [answer](https://stackoverflow.com/a/43681357/7189809), but how this actually works is a bit out of my comprehension at the moment. But yes, I can use the pixel boundaries; in fact I am doing that already, but it implies creating ID columns and merging two data frames, and it takes time with millions of points. I'll post another question for that. – stm4tt Mar 01 '18 at 20:06
  • @stm4tt Using the answer you point to will not work here, I think, because your grid of points is not aligned. The key in that answer was that the grid centres were indeed in WGS lat/long, but that the original grid was projected in another crs. Reprojecting the cell centres into the original crs made the points aligned and suitable for a `SpatialPixels` transformation. Is it possible to share the original NetCDF data to check the crs? – Gilles San Martin Mar 01 '18 at 22:26
  • @Gilles I see, so I guess the only way is to make use of the provided pixel corner points, build polygons out of them, construct an `sf` spatial data frame and proceed from there (e.g. `ggplot` + `geom_sf`). I'll give it a try. As for the original NetCDF, it's a 600+MB file, too big to share (also not really allowed to). But I `print(nc)`ed it and pasted it [here](https://pastebin.com/gS4cL63X). – stm4tt Mar 02 '18 at 08:53
  • I'm not sure I understand why you want polygons, but there also seem to be a lot of workflows to read the NetCDF files directly into R as rasters (or raster-like objects). See examples [here](http://geog.uoregon.edu/GeogR/topics/netcdf-to-raster.html) or [here](http://geog.uoregon.edu/bartlein/courses/geog490/week04-netCDF.html). – Gilles San Martin Mar 02 '18 at 12:44
  • Have you found some less convoluted way to do this? Or are the solutions here still the only ones? – Herman Toothrot Mar 17 '23 at 17:22

2 Answers


Here's one way to do this. There may be something simpler, but this works.

First I'm going to use the raster package to manipulate the coordinates. The rasters I'm creating here are 'unconventional' in that the values they contain are the location data. But using rasters rather than matrices gives access to a few helpful functions, such as extend and, most usefully, resample with its bilinear interpolation, which I'll use to find the vertices.

library(raster)

find.vertices = function(m){
  r = raster(m)
  # empty raster to hold the vertex values: one more row and column than r
  vertices = raster(matrix(nrow = dim(r)[1]+1, ncol = dim(r)[2]+1))
  # pad r with one cell on each side, filled by linear extrapolation
  x = extend(r, c(1,1))
  x[1,] = 2*x[2,] - x[3,]
  x[dim(x)[1],] = 2*x[dim(x)[1]-1,] - x[dim(x)[1]-2,]
  x[,1] = 2*x[,2] - x[,3]
  x[,dim(x)[2]] = 2*x[,dim(x)[2]-1] - x[,dim(x)[2]-2]
  # offset the vertex raster by half a cell so its cell centres sit on
  # the corners of r, then interpolate (resample defaults to bilinear)
  extent(vertices) = extent(r) + res(r)
  resample(x, vertices)
}

latv = find.vertices(lat)
lonv = find.vertices(lon)
# `lonv[]` extracts all cell values of a raster as a vector
df2 = data.frame(xc = lonv[], yc = latv[])

Let's plot these vertices to check we are on track:

ggplot(df, aes(y=lat, x=lon, fill=temp)) + 
  geom_tile(width=2, height=2) +
  geom_point(size=.1) +
  geom_point(aes(xc, yc), data=df2, inherit.aes =F) +
  borders('world', colour='gray50', size=.2) + 
  coord_quickmap(xlim=range(lon), ylim=range(lat)) +
  scale_fill_distiller(palette='Spectral') +
  theme_minimal()

[Figure: inferred vertices overlaid on the tile plot]

Now we create some Polygon objects from these vertices:

ny = NROW(lonv)  # number of rows in the vertex rasters
polys = list()
for (i in 1:length(data)) {
  x = col(data)[i]  # ground-pixel (column) index of cell i
  y = row(data)[i]  # scanline (row) index of cell i
  # linear indices of the four vertices around centre (x, y),
  # with the ring closed back onto the first vertex
  polys[[i]] = Polygon(cbind(
      lonv[c((x-1)*ny + y, (x-1)*ny + y + 1, x*ny + y + 1, x*ny + y, (x-1)*ny + y)], 
      latv[c((x-1)*ny + y, (x-1)*ny + y + 1, x*ny + y + 1, x*ny + y, (x-1)*ny + y)]
    ))
}

Convert the list of Polygon objects into a SpatialPolygonsDataFrame:

Polys = sapply(1:length(polys), function(i) Polygons(polys[i], i))
SPolys = sapply(1:length(polys), function(i) SpatialPolygons(Polys[i], i))
SPolys = do.call(bind, SPolys)
SPolysdf = SpatialPolygonsDataFrame(SPolys, data.frame(data=as.vector(data)))

To plot this object in ggplot, we need to convert it to a conventional data.frame. Traditionally most people have used fortify for this, but the ggplot documentation warns that this could be deprecated and recommends using the broom package instead. I'm not too familiar with broom, but I decided to follow this advice like so:

library(broom)
ggSPolysdf = tidy(SPolysdf)
# each polygon contributes 5 rows (4 corners plus the closing point)
ggSPolysdf = cbind(ggSPolysdf, data = rep(as.vector(data), each=5))

And finally we can plot:

ggplot(df, aes(y=lat, x=lon, fill=temp)) + 
  geom_polygon(aes(long,lat,fill=data, group = id), data=ggSPolysdf) +
  borders('world', colour='gray50', size=.2) + 
  coord_quickmap(xlim=range(lon), ylim=range(lat)) +
  scale_fill_distiller(palette='Spectral') +
  theme_minimal()

[Figure: ground pixels plotted as filled polygons on the map]

dww
  • Excellent answer. A possible improvement would be to use `sf` rather than `sp`, which I think will also be supported in a future `ggplot2` release – SymbolixAU Mar 06 '18 at 22:17

The solution below essentially uses the answer from @dww, with some changes that seem to be necessary to get the figures (at least on my platform). The changes concern, first, the definition of the polygons for the 'skewed pixels' in the last figure, and secondly, how to condense the polygons into a data frame. For the second point, the sf package is used, as suggested by @SymbolixAU.

library(raster)

find.vertices = function(m){
  r = raster(m)
  # empty raster to hold the vertex values: one more row and column than r
  vertices = raster(matrix(nrow = dim(r)[1]+1, ncol = dim(r)[2]+1))
  # pad r with one cell on each side, filled by linear extrapolation
  x = extend(r, c(1,1))
  x[1,] = 2*x[2,] - x[3,]
  x[dim(x)[1],] = 2*x[dim(x)[1]-1,] - x[dim(x)[1]-2,]
  x[,1] = 2*x[,2] - x[,3]
  x[,dim(x)[2]] = 2*x[,dim(x)[2]-1] - x[,dim(x)[2]-2]
  # offset the vertex raster by half a cell so its cell centres sit on
  # the corners of r, then interpolate (resample defaults to bilinear)
  extent(vertices) = extent(r) + res(r)
  resample(x, vertices)
}

latv = find.vertices(lat)
lonv = find.vertices(lon)
# `lonv[]` extracts all cell values of a raster as a vector
df2 = data.frame(xc = lonv[], yc = latv[])

Let's plot these vertices to check we are on track:

ggplot(df, aes(y=lat, x=lon, fill=temp)) + 
  geom_tile(width=2, height=2) +
  geom_point(size=.1) +
  geom_point(aes(xc, yc), data=df2, inherit.aes=F) +
  borders('world', colour='gray50', size=.2) + 
  coord_quickmap(xlim=range(lon), ylim=range(lat)) +
  scale_fill_distiller(palette='Spectral') +
  theme_minimal()

Now we create polygons from these vertices:

nx = NCOL(latv)  # number of columns in the vertex rasters

polys = list()
for (i in 1:length(data)) {
  x = col(data)[i]  # ground-pixel (column) index of cell i
  y = row(data)[i]  # scanline (row) index of cell i
  # raster cells are numbered row by row (row-major), hence these indices
  polys[[i]] = Polygon(cbind(
      lonv[c((y-1)*nx + x, (y-1)*nx + x + 1, y*nx + x + 1, y*nx + x, (y-1)*nx + x)], 
      latv[c((y-1)*nx + x, (y-1)*nx + x + 1, y*nx + x + 1, y*nx + x, (y-1)*nx + x)]
    ))
}

Convert the list of polygons into a SpatialPolygonsDataFrame:

Polys = sapply(1:length(polys), function(i) Polygons(polys[i], i))
SPolys = sapply(1:length(polys), function(i) SpatialPolygons(Polys[i], i))
SPolys = do.call(bind, SPolys)
SPolysdf = SpatialPolygonsDataFrame(SPolys, data.frame(data=as.vector(data)))

Using fortify, the conversion to a data frame would be done by the following two lines (caveat: as noted by @dww, this approach is discouraged by the ggplot2 documentation):

ggSPolysdf_0 = fortify(SPolysdf)
ggSPolysdf = cbind(ggSPolysdf_0, data = rep(as.vector(data), each=5))

An alternative is to use the sf package. In the following code, st_coordinates plays the role that fortify plays in ggplot2. Note that with this method the variable names are lost in the conversion and need to be restored by hand:

library(sf)
sfSPolys = st_as_sf(SPolysdf)
coord_xy_SPolys = st_coordinates(sfSPolys)
coord_xyz_SPolys = cbind(coord_xy_SPolys, data = rep(as.vector(data), each=5))
ggSPolysdf = as.data.frame(coord_xyz_SPolys)
colnames(ggSPolysdf) <- c("long", "lat", "piece", "id", "data")

And finally we can plot:

ggplot(df, aes(y=lat, x=lon, fill=temp)) + 
  geom_polygon(mapping=aes(long,lat,fill=data, group=id), data=ggSPolysdf) +
  borders('world', colour='gray50', size=.2) + 
  coord_quickmap(xlim=range(lon), ylim=range(lat)) +
  scale_fill_distiller(palette='Spectral') +
  theme_minimal()
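
Finally, note that since ggplot2 3.0.0 (released after these answers were written) sf objects can be plotted directly with geom_sf, which makes the whole conversion to a conventional data frame unnecessary. A minimal sketch, reusing the sfSPolys object from above:

library(ggplot2)  # version >= 3.0.0 for geom_sf
ggplot(sfSPolys) +
  geom_sf(aes(fill = data), colour = NA) +
  scale_fill_distiller(palette = 'Spectral') +
  theme_minimal()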