
I am trying to label points in a scatterplot in R (ggplot2) using numbers (1, 2, 3, ...) and then match the numbers to names in a legend (1 - Alpha, 2 - Bravo, 3 - Charlie... ), as a way of dealing with too many, too long labels on the plot.

Let's assume this is a.df:

Name      X Attribute   Y Attribute  Size Attribute  Color Attribute
Alpha     1             2.5          10              A
Bravo     3             3.5          5               B
Charlie   2             1.5          10              C
Delta     5             1            15              D

And this is a standard scatterplot:

ggplot(a.df, aes(x=X.Attribute, y=Y.Attribute, size=Size.Attribute, fill=Color.Attribute, label=Name)) +
   geom_point(shape=21) +
   geom_text(size=5, hjust=-0.2,vjust=0.2)

Is there a way to change it as follows?

  • have scatterplot points labeled with numbers (1,2,3...)
  • have a legend next to the plot assigning the plot labels (1,2,3...) to a.df$Name

In the next step I would like to assign other attributes to the point size and color, which may rule out some 'hacks'.

Henrik
Radoslav
  • If you post a representative data set and your attempted solution (code) it will greatly increase the likelihood of someone helping you with your question. See [this discussion](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – nrussell Jul 17 '14 at 11:18

2 Answers


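Here's an alternative solution, which draws both the point labels and the legend entries with geom_text; the legend entries are placed in the right-hand plot margin. I've borrowed from the question "ggplot2 - annotate outside of plot".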

library(MASS)  # for Cars93 data
library(grid)
library(ggplot2)

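d <- Cars93[1:30,]
d$row_num <- 1:nrow(d)                 # number drawn at each point
d$legend_entry <- paste("  ", d$row_num, d$Manufacturer, d$Model)

# evenly spaced y positions for the legend text, from top to bottom
ymin <- min(d$Price)
ymax <- max(d$Price)
y_values <- ymax-(ymax-ymin)*(1:nrow(d))/nrow(d)

p <- ggplot(d, aes(x=Min.Price, y=Price)) +
        geom_text(aes(label=row_num)) +
        # x=Inf puts the legend text at the right edge of the panel
        geom_text(aes(label=legend_entry, x=Inf, y=y_values, hjust=0)) +
        # widen the right margin to make room for the legend text
        theme(plot.margin = unit(c(1,15,1,1), "lines"))

# turn off clipping so that text outside the panel is drawn
gt <- ggplot_gtable(ggplot_build(p))
gt$layout$clip[gt$layout$name == "panel"] <- "off"
grid.draw(gt)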
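
With ggplot2 3.0.0 or later, the same idea can probably be written without editing the gtable, because coord_cartesian() accepts clip = "off". The following is only a sketch under assumptions: a.df is rebuilt by hand from the table in the question, the columns row_num, legend_entry and legend_y are names made up for illustration, and the margin width and hjust values will need tuning for real data. The regular size and fill legends are moved to the bottom so they do not collide with the text drawn in the right margin.

```r
library(ggplot2)

# Sample data rebuilt from the question's table (assumed column names)
a.df <- data.frame(
  Name            = c("Alpha", "Bravo", "Charlie", "Delta"),
  X.Attribute     = c(1, 3, 2, 5),
  Y.Attribute     = c(2.5, 3.5, 1.5, 1),
  Size.Attribute  = c(10, 5, 10, 15),
  Color.Attribute = c("A", "B", "C", "D")
)

# Illustrative helper columns: point number and "1 - Alpha" style entry
a.df$row_num      <- seq_len(nrow(a.df))
a.df$legend_entry <- paste(a.df$row_num, "-", a.df$Name)

# Evenly spaced y positions for the legend text, from top to bottom
y_rng <- range(a.df$Y.Attribute)
a.df$legend_y <- seq(y_rng[2], y_rng[1], length.out = nrow(a.df))

p <- ggplot(a.df, aes(x = X.Attribute, y = Y.Attribute)) +
  # size and fill stay free for other attributes
  geom_point(aes(size = Size.Attribute, fill = Color.Attribute), shape = 21) +
  # number next to each point
  geom_text(aes(label = row_num), hjust = -1.5, vjust = 0.2) +
  # legend entries at the right panel edge, drawn into the margin
  geom_text(aes(x = Inf, y = legend_y, label = legend_entry), hjust = -0.1) +
  # allow drawing outside the panel
  coord_cartesian(clip = "off") +
  theme(plot.margin = grid::unit(c(1, 8, 1, 1), "lines"),
        legend.position = "bottom")

print(p)
```

Because the point layer is an ordinary geom_point, size and fill can be remapped to other columns without touching the numbered legend.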


James Trimble
  • Thank you, I think if I try adding geom_point() to the initial plot p, this could work. Alternatively, I will see if I can plot two sets of shapes, offset against each other: 1) points that can be resized and coloured, 2) letters that could be assigned to the legend. – Radoslav Jul 17 '14 at 12:16
  • @james What method would you suggest to have the numerals and the text in different colors in the legend? – CovetTachi Aug 22 '23 at 13:24

This is pretty hacky, but might help. The plot labels are simply added by geom_text, and to produce a legend, I've mapped colour to a label in the data. Then, to stop the points being coloured, I override the colour scale with scale_colour_manual, where you can set the colour of the points as well as the labels in the legend. Finally, I made the points in the legend invisible by setting alpha = 0, and removed the squares that are usually behind the legend dots with legend.key = element_blank() in theme().

library(ggplot2)

set.seed(1)  # make the random example reproducible
dat <- data.frame(id = 1:10, x = rnorm(10), y = rnorm(10), label = letters[1:10])
ggplot(dat, aes(x, y)) + geom_point(aes(colour = label)) +
  geom_text(aes(x = x + 0.1, label = id)) +               # number next to each point
  scale_colour_manual(values = rep("black", nrow(dat)),   # keep every point black
                      labels = paste(dat$id, "=", dat$label)) +
  guides(colour = guide_legend(override.aes = list(alpha = 0))) +  # hide legend dots
  theme(legend.key = element_blank())                     # hide legend key backgrounds
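
If size and fill are needed for other attributes, one possible variation is to feed the legend from a separate, fully transparent point layer that uses the outline colour aesthetic, which a shape 21 point with a mapped fill leaves free. This is a sketch under assumptions rather than a tested recipe: a.df is rebuilt by hand from the question's table, and row_num and legend_entry are illustrative column names. Turning legend_entry into a factor with explicit levels keeps the entries in numeric order once there are ten or more points.

```r
library(ggplot2)

# Sample data rebuilt from the question's table (assumed column names)
a.df <- data.frame(
  Name            = c("Alpha", "Bravo", "Charlie", "Delta"),
  X.Attribute     = c(1, 3, 2, 5),
  Y.Attribute     = c(2.5, 3.5, 1.5, 1),
  Size.Attribute  = c(10, 5, 10, 15),
  Color.Attribute = c("A", "B", "C", "D")
)

# Illustrative helper columns; the factor levels fix the legend order
a.df$row_num      <- seq_len(nrow(a.df))
entries           <- paste(a.df$row_num, "-", a.df$Name)
a.df$legend_entry <- factor(entries, levels = entries)

p <- ggplot(a.df, aes(x = X.Attribute, y = Y.Attribute)) +
  # visible points: size and fill carry real attributes
  geom_point(aes(size = Size.Attribute, fill = Color.Attribute), shape = 21) +
  # invisible points: exist only to create one legend entry per name
  geom_point(aes(colour = legend_entry), alpha = 0) +
  # number next to each point
  geom_text(aes(label = row_num), hjust = -1.5, vjust = 0.2) +
  scale_colour_manual(name = "Name", values = rep("black", nrow(a.df))) +
  # keep the legend keys for the name legend empty
  guides(colour = guide_legend(override.aes = list(alpha = 0))) +
  theme(legend.key = element_blank())

print(p)
```

Note that legend.key = element_blank() also removes the key backgrounds of the size and fill legends.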


alexwhan
  • Thank you, good tip, but I would like to use size and colour for other attributes. I updated the question with sample data, apologies for not including that in the first place. – Radoslav Jul 17 '14 at 12:01