4

I have a dataset similar to the below example

df <- structure(list(Species = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,3L, 
1L, 2L, 3L), .Label = c("setosa", "versicolor", "virginica"), class = 
"factor"), flower_att = c("Sepal.Length", "Sepal.Length", "Sepal.Length", 
"Sepal.Width", "Sepal.Width", "Sepal.Width", "Petal.Length", "Petal.Length", 
"Petal.Length", "Petal.Width", "Petal.Width", "Petal.Width"), measurement = 
c(5.1, 7, 6.3, 3.5, 3.2, 3.3, 1.4, 4.7, 6, 0.2, 1.4, 2.5), month = 
c("January", "February", "January", "February", "January", "February", 
"January", "February", "January", "February", "January", "February")), 
row.names = c(NA,-12L), class = "data.frame")

I want to display both sepal length and width for each species and month side by side. I was hoping to do this using a diagonal split cell in the heatmap with 2 different colour legends i.e. red for length and blue for width. If possible I would like the value to be displayed within the cell segment. My search so far has found this closest example but I am looking for a workable ggplot version.

My own attempt currently looks like the below. I cannot work out how to break up the cells.

ggplot(df, aes(x=month, y=Species)) +   geom_tile(aes(fill=measurement), 
color="black") +   theme(axis.text.x = element_text(angle=45, hjust = .5)) +   
geom_text(aes(label = round(measurement, .1))) +   scale_fill_gradient(low = 
"white", high = "red")

Update

After some serious digging through the internet I have found a potential option using geom_segment and geom_text_repel, see below. Could anyone tell me if this a viable option to pursue? If so how can I get it to meet the requirements above?

I am open to switching scale_fill_gradient to scale_fill_manual or other alternative, my main objective is to have the all data displayed side by side

ggplot(df, aes(x=month, y=Species)) +
geom_tile(aes(fill=measurement), color="black") +
theme(axis.text.x = element_text(angle=45, hjust = .5)) +
geom_text_repel(aes(label = round(measurement, .1))) +
scale_fill_gradient(low = "white", high = "red")

gb <- ggplot_build(p)

p + geom_segment(data=gb$data[[1]],
aes(x=xmin, xend=xmax, y=ymin, yend=ymax), color="black")
tjebo
  • 21,977
  • 7
  • 58
  • 94
AudileF
  • 436
  • 2
  • 10
  • what have you tried so far? – Mike Jan 25 '21 at 14:59
  • Hi Mike ive used the following so far but I cannot workout how to perform the split `ggplot(df, aes(x=month, y=Species)) + geom_tile(aes(fill=measurement), color="black") + theme(axis.text.x = element_text(angle=45, hjust = .5)) + geom_text(aes(label = round(measurement, .1))) + scale_fill_gradient(low = "white", high = "red")` – AudileF Jan 25 '21 at 15:02
  • 1
    good question! I am not sure how to do that, I would also edit your question to put the ggplot code in there so other people can help troubleshoot. – Mike Jan 25 '21 at 15:10
  • 1
    There appears to be other efforts on this, such as in [this post](https://stackoverflow.com/questions/48531257/ggplot-geom-point-how-to-set-font-of-custom-plotting-symbols). – Ben Jan 25 '21 at 19:55
  • Thanks for the link @Ben I tried the askers example and it does not give the same result as they got. But Ill see if I can work it around :) – AudileF Jan 26 '21 at 08:11
  • I found 2 attempts to create a `geom_triangle`, [one is available on GitHub](https://rdrr.io/github/GuangchuangYu/gglayer/src/R/geom_triangle.R), the other only on [a source code on rdrr.io](https://rdrr.io/github/houyunhuang/ggtriangle/src/R/geom-triangle.R). Both do draw triangles, but the size for the first is very difficult to change and the otherwise clever fill aesthetics in the latter are breaking - they are turned to categorical and there is no immediate workaround to make it continuous – tjebo Jan 27 '21 at 11:21

1 Answers1

0

This is slightly hacky, but to be honest, without creating a dedicated geom, I don't think you can get it less hacky - and creating a geom can also get somewhat hacky :)

  • Creating triangle polygons for each x/y coordinate with sapply. I guess you could use that approach for your compute_group layer in your future StatSplitTile.
  • The messing with factors is a necessary evil to get the order right. If you want a specific order in your y axis, you would also need to factorise Species first.
  • Using ggnewscale for a very simple way of having several fill scales.
  • set limits to the same for better comparability
  • coord_equal to make it look nicer
library(tidyverse)

mydat <- structure(list(Species = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("setosa", "versicolor", "virginica"), class = "factor"), flower_att = c("Sepal.Length", "Sepal.Length", "Sepal.Length", "Sepal.Width", "Sepal.Width", "Sepal.Width", "Petal.Length", "Petal.Length", "Petal.Length", "Petal.Width", "Petal.Width", "Petal.Width"), measurement = c(5.1, 7, 6.3, 3.5, 3.2, 3.3, 1.4, 4.7, 6, 0.2, 1.4, 2.5), month = c("January", "February", "January", "February", "January", "February", "January", "February", "January", "February", "January", "February")),
  row.names = c(NA, -12L), class = "data.frame"
)

make_triangles <- function(x, y, point = "up") {
  x <- as.integer(as.factor((x)))
  y <- as.integer(as.factor((y)))

  if (point == "up") {
    newx <- sapply(x, function(x) {
      c(x - 0.5, x - 0.5, x + 0.5)
    }, simplify = FALSE)
    newy <- sapply(y, function(y) {
      c(y - 0.5, y + 0.5, y + 0.5)
    }, simplify = FALSE)
  } else if (point == "down") {
    newx <- sapply(x, function(x) {
      c(x - 0.5, x + 0.5, x + 0.5)
    }, simplify = FALSE)
    newy <- sapply(y, function(y) {
      c(y - 0.5, y - 0.5, y + 0.5)
    }, simplify = FALSE)
  }
  data.frame(x = unlist(newx), y = unlist(newy))
}

# required, otherwise you cannot use the values as fill
mydat_wide <- mydat %>% pivot_wider(names_from = "flower_att", values_from = "measurement")
# making your ordered months factor
mydat_wide$month <- droplevels(factor(mydat_wide$month, levels = month.name))
# The actual triangle computation
newcoord_up <- make_triangles(mydat_wide$month, mydat_wide$Species)
newcoord_down <- make_triangles(mydat_wide$month, mydat_wide$Species, point = "down")
# just a dirty trick for renaming
newcoord_down <- newcoord_down %>% select(xdown = x, ydown = y)
# you need to repeat each row of your previous data frame 3 times
repdata <- map_df(1:nrow(mydat_wide), function(i) mydat_wide[rep(i, 3), ])
newdata <- bind_cols(repdata, newcoord_up, newcoord_down)

ggplot(newdata) +
  geom_polygon(aes(x = x, y = y, fill = Sepal.Length, group = interaction(Species, month)), color = "black") +
  scale_fill_gradient(low = "white", high = "red", limits = c(0, 10)) +
  ggnewscale::new_scale_fill() +
  geom_polygon(aes(x = xdown, y = ydown, fill = Sepal.Width, group = interaction(Species, month)), color = "black") +
  scale_fill_gradient(low = "white", high = "red", limits = c(0, 10)) +
  scale_x_continuous(breaks = seq_along(unique(mydat_wide$month)), 
                     labels = unique(levels(mydat_wide$month))) +
  scale_y_continuous(breaks = seq_along(unique(mydat_wide$Species)),
                     labels = unique(mydat_wide$Species))+
  coord_equal()

Created on 2021-01-27 by the reprex package (v0.3.0)

tjebo
  • 21,977
  • 7
  • 58
  • 94
  • P.S. - the grey triangles are NA, this is because of your provided data – tjebo Jan 27 '21 at 12:34
  • Also - I am not sure about you categorical bit - maybe I misunderstood what you wanted to plot. – tjebo Jan 27 '21 at 12:51
  • Thanks tjebo this looks like what I want. However I dont understand why there should be NA's in the data. Ideally the length should be the top triangle and the width the bottom for each 'cell' in the heatmap, hope that clarify things. – AudileF Jan 27 '21 at 13:04
  • @AudileF for example, check if you gave us a "setosa, January, Sepal.Width" (hint - you didn't) – tjebo Jan 27 '21 at 13:30