Imagine I plot this toy data:

library(ggplot2)

lev <- c("A", "B", "C", "D")

nodes <- data.frame(ord=c(1,1,1,2,2,3,3,4), brand=
  factor(c("A", "B", "C","B", "C","D", "B","D"), levels=lev), 
  thick=c(16,9,9,16,4,1,4,1))

edge <-  data.frame(ord1=c(1,1,2,3), brand1=factor(c("C","A","B","B"), 
  levels=lev),ord2=c(2,2,3,4), brand2=c("C","B","B","D"),
  N1=c(2,1,2,1), N2=c(5,5,2,1))

ggplot() +
  geom_point(data = nodes,
    aes(x = ord, y = brand, size = sqrt(thick)),
    color = "black", shape = 16, show.legend = T) + 
  scale_x_continuous(limits=c(1, 4), breaks=seq(0,4,1),
    minor_breaks = NULL) + 
  geom_segment(data = edge,
    aes(x = ord1, y = brand1, xend = ord2, yend = brand2), 
    color = "blue", size = edge$N2/edge$N1) +
  ylim(lev) +
  theme_bw()

I get this plot, as expected.

I would like to add another legend (below the nodes) relating the width of the segments and N2/N1.

PS: Following some of your suggestions...

ggplot() + 
  geom_segment(data = edge,
    aes(x = ord1, y = brand1, xend = ord2, yend = brand2, size = N2/N1), 
    color = "blue", show.legend = T) +
  geom_point(data = nodes,
    aes(x = ord, y = brand, size = thick),
    color = "black", shape = 16, show.legend = T) + 
  scale_x_continuous(limits = c(1, 4), breaks = 0:4,
    minor_breaks = NULL) +
  scale_size_continuous(trans = "sqrt", breaks = c(1,4,9,16))  + 
  ylim(lev) +  theme_bw()

I got the legend but it overlaps with the other one.

I can try using colors or transparency instead of widths. This version varies alpha:

ggplot() +
  geom_segment(data = edge,
    aes(x = ord1, y = brand1, xend = ord2, yend = brand2, alpha = N2/N1),
    size = 1, show.legend = TRUE) +
  geom_point(data = nodes,
    aes(x = ord, y = brand, size = thick),
    color = "black", shape = 16, show.legend = TRUE) +
  scale_x_continuous(limits = c(1, 4), breaks = seq(0, 4, 1),
    minor_breaks = NULL) +
  scale_size_continuous(trans = "sqrt", breaks = c(1, 4, 9, 16)) +
  ylim(lev) + theme_bw()

Though I prefer the original approach with widths because in my real plot I'll have many lines crossing.

PS: Is there any solution with lattice, or any alternative, that can be exported as SVG or vector PDF?

PS2: I've found another problem: small points aren't scaled properly, and sometimes it is impossible to force ggplot to show a proper legend: How can I force ggplot to show more levels on the legend?

skan
  • Don't square root the size in your data, do it in the scale: `scale_size_continuous(trans = "sqrt")`. This will solve your labeling issues (first bullet). Adding a separate legend for line thickness will be tougher... questions are better when they focus on single issues, not lots of stuff. – Gregor Thomas Oct 11 '17 at 19:25
  • 1) 2) To have a legend you need to use aes to map the width 3) plot the points last so that they are on top of segments. – Pedro J. Aphalo Oct 11 '17 at 19:25
  • 1) An alternative to Gregor's answer is to use `scale_size_area()` which maps the data to the area of points instead of as by default to diameter, 2) a key for segment thickness will require at the very least to use `aes()` to map size of segments 3) plot the points last so that they are on top of segments. I suspect some of your questions must have been answered. The first and third are trivial, the one on segment size, could be useful to others. Check if an answer exists, and if not, write a new question for it, with a descriptive title. – Pedro J. Aphalo Oct 11 '17 at 19:38
  • @Gregor Where do I specify the variable to be transformed if there are several? How do I specify the limits of those numbers? I've tried, and I can only see from 4 to 16; I would like from 1 to 16. – skan Oct 11 '17 at 19:39
  • @PedroAphalo I prefer to transform it because I could also use logarithms or squares. – skan Oct 11 '17 at 19:40
  • @skan Using area when the shapes are solid matches our visual perception of size. Area is anyway proportional to square root of the normal mapping to diameter. You can of course apply also transformations to the area scale as to any other continuous scale. However, what you ask can be answered by looking at available tutorials and documentation. This site is meant for questions and answers of lasting value and usefulness. – Pedro J. Aphalo Oct 11 '17 at 19:48
  • I strongly suggest breaking this into two questions. Generally, anything mapped to the same aesthetic uses the same scale, with or without transformation. So you've got one question regarding scales for a single variable: transformations, ranges and such. (If you read `?continuous_scale` and `?scale_size_continuous`, all your answers are there for that one.) Then, you have a second, much harder question about wanting to map two different variables to `size`, one for points, one for lines, and have different legends for each. Maybe where only one of them has a transformation. That's question 2. – Gregor Thomas Oct 11 '17 at 19:49
  • OK, I've just left the main and most difficult question, how to create a legend with the width of the segments. – skan Oct 11 '17 at 19:51
  • @Gregor they already produce a plot, I just need to add the legend. – skan Oct 11 '17 at 19:53
  • @Gregor You are right in that the hardest part is getting around the problem of the size aesthetic being used in two different geoms to represent different quantities. This is against the philosophy behind the grammar of graphics, and made difficult if not impossible by the design of 'ggplot2'. – Pedro J. Aphalo Oct 11 '17 at 19:59
  • For example, here are two questions for multiple legends for the same aesthetic. They both use different work-arounds, but I don't think either will work in your case. [example 1](https://stackoverflow.com/q/12567200/903061), [example 2](https://stackoverflow.com/a/9916295/903061). Instead, I think you'll have to do something [like this](https://stackoverflow.com/q/20129299/903061). – Gregor Thomas Oct 11 '17 at 19:59
  • @Gregor What about using guides?, but I don't know how. https://stackoverflow.com/questions/25007324/can-ggplot2-control-point-size-and-line-size-lineweight-separately-in-one-lege – skan Oct 11 '17 at 20:07
  • Would a base graphics solution be acceptable rather than ggplot? – dww Oct 11 '17 at 20:21
  • @dww Yes, it would, as long as the output is nice and exportable to a vector format. Anyway, I decided to do it with ggplot because it was supposed to be easier for this kind of thing. – skan Oct 11 '17 at 20:23
  • Maybe using different segment colors instead of widths is going to be easier? – skan Oct 11 '17 at 20:23
  • Using color rather than width for the segments will be trivial, just change `size = N2/N1` to `color = N2/N1`. – Gregor Thomas Oct 11 '17 at 20:44
  • @Gregor, I've tried colour=hsv(N1/N2,1,1) inside the aes() but it says "invalid hsv color". I'm writing N1/N2 to get numbers smaller than 1. It would be great to get the colors of YlOrRd with reds for higher numbers. I've also tried + scale_color_brewer(palette="YlOrRd") but it says "Error: Continuous value supplied to discrete scale" – skan Oct 11 '17 at 20:56
  • Literally just do `color = N2/N1` inside `aes()`. Don't wrap it in `hsv()`. You can then add a *continuous* color scale. There are lots of examples in `?scale_color_continuous`. The `scale_color_brewer` scales are discrete, the continuous analogs are in `scale_color_distiller`, so you could do `scale_color_distiller(palette = "YlOrRd")` – Gregor Thomas Oct 11 '17 at 21:30
  • @Gregor And I have the same problem with the name in the legend of the color, it has a formula. How can I get rid of it (without creating a new column on my database) ? something like scale_size_continuous for the colors legend? I want it to be N1/N2. – skan Oct 11 '17 at 21:48
  • When should I use the command guides()? – skan Oct 11 '17 at 21:48
  • https://stackoverflow.com/q/14622421/903061 – Gregor Thomas Oct 11 '17 at 21:54
  • I would still like to get the plot with line widths. Is it possible to create two plots, one with just the nodes and their legend, another with just the segments and their legend, and then overlay them on top of each other? – skan Oct 12 '17 at 17:48
  • @Gregor, how can I write the transformation if I want to multiply the size by 10 (instead of trans = "sqrt")? – skan Oct 12 '17 at 20:51
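
The continuous color mapping suggested in the comments above can be sketched as follows. This is a minimal sketch on the question's toy `edge` data, not code posted by any of the commenters. Two details are assumptions added here: `direction = 1`, because `scale_color_distiller()` reverses the palette by default and the asker wanted reds for higher values, and the legend title passed through `name`, which avoids showing the formula without adding a column to the data.

```r
library(ggplot2)

lev <- c("A", "B", "C", "D")

# Toy edge data from the question (brand2 made a factor, as in the second answer)
edge <- data.frame(
  ord1 = c(1, 1, 2, 3),
  brand1 = factor(c("C", "A", "B", "B"), levels = lev),
  ord2 = c(2, 2, 3, 4),
  brand2 = factor(c("C", "B", "B", "D"), levels = lev),
  N1 = c(2, 1, 2, 1),
  N2 = c(5, 5, 2, 1)
)

# Map the ratio to color inside aes(); no hsv() wrapper is needed
p_col <- ggplot(edge) +
  geom_segment(
    aes(x = ord1, y = brand1, xend = ord2, yend = brand2, color = N2 / N1)
  ) +
  # Continuous analog of scale_color_brewer(); direction = 1 keeps
  # yellow for low values and red for high values
  scale_color_distiller(palette = "YlOrRd", direction = 1, name = "N2/N1") +
  ylim(lev) +
  theme_bw()

p_col
```

The same `name` argument is available on the size scales, so the legend title can be set there in the same way.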

2 Answers

Using a highly experimental package I put together:

library(ggplot2) # >= 2.3.0
library(dplyr)
library(relayer) # remotes::install_github("clauswilke/relayer")

# make an aesthetics-aware size scale, with better scaling
scale_size_c <- function(name = waiver(), breaks = waiver(), labels = waiver(), 
          limits = NULL, range = c(1, 6), trans = "identity", guide = "legend", aesthetics = "size") 
{
  continuous_scale(aesthetics, "area", scales::rescale_pal(range), name = name, 
                   breaks = breaks, labels = labels, limits = limits, trans = trans, 
                   guide = guide)
}


lev <- c("A", "B", "C", "D")

nodes <- data.frame(
  ord = c(1,1,1,2,2,3,3,4),
  brand = factor(c("A", "B", "C", "B", "C", "D", "B", "D"), levels=lev), 
  thick = c(16, 9, 9, 16, 4, 1, 4, 1)
)

edge <- data.frame(
  ord1 = c(1, 1, 2, 3),
  brand1 = factor(c("C", "A", "B", "B"), levels = lev),
  ord2 = c(2, 2, 3, 4),
  brand2 = c("C", "B", "B", "D"),
  N1 = c(2, 1, 2, 1),
  N2 = c(5, 5, 2, 1)
)

ggplot() + 
  (geom_segment(
    data = edge,
    aes(x = ord1, y = brand1, xend = ord2, yend = brand2, edge_size = N2/N1), 
    color = "blue"
  ) %>% rename_geom_aes(new_aes = c("size" = "edge_size"))) +
  (geom_point(
    data = nodes,
    aes(x = ord, y = brand, node_size = thick),
    color = "black", shape = 16
  ) %>% rename_geom_aes(new_aes = c("size" = "node_size"))) + 
  scale_x_continuous(
    limits = c(1, 4),
    breaks = 0:4,
    minor_breaks = NULL
  ) +
  scale_size_c(
    aesthetics = "edge_size",
    breaks = 1:5,
    name = "edge size",
    guide = guide_legend(keywidth = grid::unit(1.2, "cm"))
  )  + 
  scale_size_c(
    aesthetics = "node_size",
    trans = "sqrt",
    breaks = c(1, 4, 9, 16),
    name = "node size"
  )  + 
  ylim(lev) + theme_bw()

Created on 2018-05-16 by the reprex package (v0.2.0).
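
The idea above is to give segments and points separately named aesthetics so that each gets its own scale and legend. The following sketch is not part of the original answer. It assumes ggplot2 3.4.0 or later, where line geoms take their width from a `linewidth` aesthetic that is separate from `size`, so no renaming is needed. The `range` and `max_size` values and the legend titles are illustrative choices.

```r
library(ggplot2)  # assumption: ggplot2 >= 3.4.0 (provides the linewidth aesthetic)

lev <- c("A", "B", "C", "D")

nodes <- data.frame(
  ord = c(1, 1, 1, 2, 2, 3, 3, 4),
  brand = factor(c("A", "B", "C", "B", "C", "D", "B", "D"), levels = lev),
  thick = c(16, 9, 9, 16, 4, 1, 4, 1)
)

edge <- data.frame(
  ord1 = c(1, 1, 2, 3),
  brand1 = factor(c("C", "A", "B", "B"), levels = lev),
  ord2 = c(2, 2, 3, 4),
  brand2 = factor(c("C", "B", "B", "D"), levels = lev),
  N1 = c(2, 1, 2, 1),
  N2 = c(5, 5, 2, 1)
)

p <- ggplot() +
  # Segments first, so that the points are drawn on top of them
  geom_segment(
    data = edge,
    aes(x = ord1, y = brand1, xend = ord2, yend = brand2, linewidth = N2 / N1),
    color = "blue"
  ) +
  geom_point(
    data = nodes,
    aes(x = ord, y = brand, size = thick),
    color = "black", shape = 16
  ) +
  scale_x_continuous(limits = c(1, 4), breaks = 1:4, minor_breaks = NULL) +
  # One scale per aesthetic, hence one legend per aesthetic
  scale_linewidth_continuous(name = "edge size", breaks = 1:5, range = c(0.5, 5)) +
  scale_size_area(name = "node size", breaks = c(1, 4, 9, 16), max_size = 8) +
  ylim(lev) +
  theme_bw()

p
```

On older ggplot2 versions, where segments and points both use `size`, this sketch does not apply and a renaming approach like the one above is still needed.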

Claus Wilke

Sometimes ggplot may not be the best tool for the job. It pays to be familiar with some other plotting options for these instances, with R's base graphics system being reasonably versatile.

Here's how you might do it in base graphics:

lev <- c("A", "B", "C", "D")    
nodes <- data.frame(ord=c(1,1,1,2,2,3,3,4), brand=
    factor(c("A", "B", "C","B", "C","D", "B","D"), levels=lev), 
  thick=c(16,9,9,16,4,1,4,1))   
edge <-  data.frame(ord1=c(1,1,2,3), 
  brand1=factor(c("C","A","B","B"), levels=lev),
  ord2=c(2,2,3,4), 
  brand2=factor(c("C","B","B","D"), levels=lev),
  N1=c(2,1,2,1), N2=c(5,5,2,1))

png(width = 6, height = 4, units = 'in',res=300)
par(xpd=FALSE, mar = c(5, 4, 4, 15) + 0.1)
plot(NULL, NULL,  xaxt = "n", yaxt = "n",
  xlim = c(1,4), ylim = c(1,4), 
  xlab = 'ord', ylab = 'brand')
axis(side = 1, at = 1:4)
axis(side = 2, at = 1:4, labels = LETTERS[1:4])
grid()
par(xpd=TRUE)
segments(edge$ord1, as.integer(edge$brand1), 
  edge$ord2, as.integer(edge$brand2), 
  lwd = 4*edge$N2/edge$N1,
  col='blue')

points(nodes$ord, as.integer(nodes$brand), cex=sqrt(nodes$thick), pch=16)
legend(4.5,4,
  legend = as.character(c(1,2,4,8,16)), 
  pch = 16, 
  cex = 1.5,
  pt.cex = sqrt(c(1,2,4,8,16)))
legend(6,4,
  legend = as.character(1:5), 
  lwd = 4*(1:5), 
  col = 'blue',
  cex = 1.5)
dev.off()

dww
  • @skan I changed your definition of `edge`, because you forgot to make `edge$brand2` a factor in the OP. So `brand2` gets different levels from `brand1`. Try running my entire code, including the variable definitions – dww Oct 11 '17 at 23:45
  • PS: Is there any solution with lattice, or any alternative, that can be exported as SVG or vector PDF? – skan Oct 24 '17 at 09:52
  • This method works fine for vector graphics formats. Just use `svg()` or `pdf()` in place of the `png()` command – dww Oct 25 '17 at 02:53
  • How do you know what to write at cex and pt.cex? I want to use the points with `NP=c(0.375, 0.250, 0.125, 0.125, 0.125)`. Then I've plotted them as `points(nodos$ord, factor(nodos$event), cex=10*(nodos$NP), pch=16)`. What should I use for legend? I've tried `legend(4.5,4, legend = as.character(c(5,10,20,40)), pch = 16, cex = 1, pt.cex = (c(5,10,20,40)))` but the width of the legend's points doesn't match the width of the plot's points. – skan Nov 16 '17 at 20:09
  • @skan It's pretty hard to follow what you're asking in the comment format. Basically, `cex` in the plot should correspond to the values of `pt.cex` in the legend. It looks like your `pt.cex` vector is too big (ranging from 5 to 40) compared to the points, which range from 1.25 to 3.75. – dww Nov 16 '17 at 20:25
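
The two points from these comments (writing to a vector device, and keeping `pt.cex` in the legend consistent with `cex` in the plot) can be sketched together. This is a minimal sketch under stated assumptions: the `NP` values come from the comment above, while the helper `node_cex()`, the x/y positions, the legend keys, and the output file name are hypothetical and only serve the illustration.

```r
# NP values taken from the comment; everything else below is illustrative
np <- c(0.375, 0.250, 0.125, 0.125, 0.125)

# One scaling rule, used for both the plotted points and the legend keys,
# so that the legend symbols match the symbols in the plot
node_cex <- function(v) 10 * v

out_file <- file.path(tempdir(), "nodes.pdf")  # hypothetical output path

pdf(out_file, width = 6, height = 4)           # vector device instead of png()
par(mar = c(5, 4, 4, 10) + 0.1)                # leave room on the right for the legend
plot(NULL, xlim = c(1, 5), ylim = c(0, 1),
     xlab = "ord", ylab = "", yaxt = "n")
points(seq_along(np), rep(0.5, length(np)), cex = node_cex(np), pch = 16)

key <- c(0.125, 0.25, 0.375)                   # data values shown in the legend
par(xpd = TRUE)                                # allow drawing in the margin
legend(5.4, 1,
       legend = as.character(key),
       pch = 16,
       pt.cex = node_cex(key),                 # same rule as cex above
       y.intersp = 2, bty = "n")
dev.off()
```

Labelling the legend with the data values and deriving `pt.cex` from them through the same function as `cex` avoids the mismatch described above, where the legend sizes ranged from 5 to 40 while the plotted sizes ranged from 1.25 to 3.75.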