
I've done the rounds here and on Google without finding a solution, so please help if you can.

I'm looking to create something like this using ggplot2: [image: painSensitivityHeatMap]

I can create something kinda similar using geom_tile, but without the smoothing between data points ... the only solution I have found requires a lot of code and data interpolation. Not very elegant, methinks. [image: uglySolutionUsingTile]

So I'm thinking I could coerce the density2d plot to my purposes instead, by having it use fixed values rather than a calculated data-point density -- much in the same way that stat='identity' can be used in bar charts to make them represent data values rather than data counts.

So a minimal working example:

library(ggplot2)

df <- expand.grid(letters[1:5], LETTERS[1:5])
df$value <- sample(1:4, 25, replace=TRUE)

# A not so pretty, non-smooth tile plot
ggplot(df, aes(x=Var1, y=Var2, fill=value)) + geom_tile()

# A potentially beautiful density2d plot, except it fails :-(
ggplot(df, aes(x=Var1, y=Var2)) + geom_density2d(aes(color=..value..))
Søren ONeill
  • I guess geom_density requires a continuous input (in contrast to a heatmap). Perhaps convert to numeric and change the labels? See https://wahani.github.io/2015/12/smoothScatter-with-ggplot2/ – timfaber May 03 '17 at 12:54
  • See `?stat_contour`. You'll need a model to interpolate if you want to smooth things out. – Axeman May 03 '17 at 13:20
  • I would look into https://en.wikipedia.org/wiki/Kriging. The `kriging` package has a one line solution to interpolate the data to plot. – troh May 03 '17 at 14:04

1 Answer

This took me a little while, but here is a solution for future reference. It uses idw from the gstat package and spsample from the sp package.

I've written a function which takes a data frame, the number of blocks (tiles), and lower and upper anchors for the colour scale.

The function creates a polygon (a simple 5x5 square) and from that creates a regular grid of points covering that shape.

In my data, the location variables are ordered factors -- therefore I unclass them into numbers (1 to 5, corresponding to the polygon grid) and convert them to coordinates, which turns tmpDF from a data frame into a spatial data frame. Note: there are no overlapping/duplicate locations, i.e. 25 observations corresponding to the 5x5 grid.
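To illustrate the unclass step in isolation, here is a minimal base-R sketch. The `locs` data frame and its values are made up for the example and are not my real data; it only shows that the integer codes follow the order of the factor levels.

```r
# Hypothetical ordered factors standing in for the location variables
locs <- data.frame(
  x = factor(c("a", "c", "e"), levels = letters[1:5], ordered = TRUE),
  y = factor(c("B", "A", "D"), levels = LETTERS[1:5], ordered = TRUE)
)

# unclass() exposes the underlying integer codes (keeping the "levels"
# attribute); as.integer() returns the same codes as a plain integer vector
locs$x_num <- as.integer(locs$x)
locs$y_num <- as.integer(locs$y)

print(locs)
```

The codes depend on the level order, so the factor levels need to be in the order you want along each axis before converting.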

The idw function fills in the polygon grid (newdata) with inverse-distance-weighted values ... in other words, it interpolates my 25 observations onto the full polygon grid with the given number of tiles ('blocks').

Finally, I create a ggplot with a colour gradient (matlab.like2) from the colorRamps package.
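For anyone wondering what inverse-distance weighting actually computes, here is a rough base-R sketch of the idea. It is not what gstat does internally, just the textbook formula; `idw_point`, `obs` and `grid` are hypothetical names for this example, the data are random, and the power of 2 is chosen to match the default `idp` of `idw`.

```r
# Hypothetical 5x5 observations, shaped like the example in the question
set.seed(1)
obs <- expand.grid(x = 1:5, y = 1:5)
obs$value <- sample(1:4, 25, replace = TRUE)

# Regular 50 x 50 grid covering the same square as the polygon (0.5 to 5.5)
grid <- expand.grid(x = seq(0.5, 5.5, length.out = 50),
                    y = seq(0.5, 5.5, length.out = 50))

# Inverse-distance-weighted prediction at a single point (px, py):
# a weighted mean of all observations, with weights 1 / distance^power
idw_point <- function(px, py, obs, power = 2) {
  d <- sqrt((obs$x - px)^2 + (obs$y - py)^2)
  # A grid point that coincides with an observation takes its value directly
  if (any(d == 0)) return(obs$value[which(d == 0)[1]])
  w <- 1 / d^power
  sum(w * obs$value) / sum(w)
}

# Predict at every grid point
grid$pred <- mapply(idw_point, grid$x, grid$y, MoreArgs = list(obs = obs))
```

Because each prediction is a weighted mean, the interpolated values always stay between the smallest and largest observed value, which is why the colour scale limits can be taken from the data.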

library(dplyr)      # mutate, filter, %>%
library(sp)         # Polygon, spsample, coordinates
library(gstat)      # idw
library(ggplot2)
library(colorRamps) # matlab.like2

# tmpDF needs columns x and y (ordered factors) and value (numeric)
painMapLumbar <- function(tmpDF, blocks=2500, lowLimit=min(tmpDF$value), highLimit=max(tmpDF$value)) {
  # Create a polygon to represent the lower back (lumbar): a square from 0.5 to 5.5
  poly <- Polygon(matrix(c(0.5, 0.5,  0.5, 5.5,  5.5, 5.5,  5.5, 0.5,  0.5, 0.5), ncol=2, byrow=TRUE))

  # Create a regular grid of about `blocks` points inside the polygon
  polyGrid <- spsample(poly, n=blocks, type="regular")
  # Convert the ordered factors to their integer codes (1 to 5)
  tmpDF <- tmpDF %>% mutate(x=unclass(x)) %>% mutate(y=unclass(y))
  tmpDF <- tmpDF %>% filter(y<6) # Lumbar region only
  # Turn the data frame into a spatial data frame
  coordinates(tmpDF) <- ~x+y
  # Interpolate the data onto the grid by inverse distance weighting
  invDistanceWeighted <- as.data.frame(idw(formula = value ~ 1, locations = tmpDF, newdata = polyGrid))
  # x1 and x2 are the grid coordinates, var1.pred the interpolated value
  p <- ggplot(invDistanceWeighted, aes(x=x1, y=x2, fill=var1.pred)) + geom_tile() +
    scale_fill_gradientn(colours=matlab.like2(100), limits=c(lowLimit,highLimit))
  return(p)
}

I hope this is useful to someone ... thanks for the replies above ... they helped me move on.

Søren ONeill