
I have a line plot in which some time points are hard to distinguish by colour alone, so I would like to label them directly on the plot. However, the labels overlap (see the plot below), which makes them hard to read.

The plot currently looks like this:

[image: current plot]

I wonder if there is a way to 'stack' the labels, or a script that can ensure they do not overlap. Something like this:

[image: sketch of the desired label placement]

Any help would be appreciated.

Here is the code I used to produce the plot:

require(ggplot2)
require(plyr)
require(reshape)

# create sample data
# (indexing past the 24 generated dates yields NA, so some dates are missing)
set.seed(666)
dfn <- data.frame(
  Referral  = seq(as.Date("2007-01-15"), len= 26, by="23 day"),
  VISIT01  = seq(as.Date("2008-06-15"), len= 24, by="15 day")[sample(30, 26)],
  VISIT02  = seq(as.Date("2008-12-15"), len= 24, by="15 day")[sample(30, 26)],
  VISIT03  = seq(as.Date("2009-01-01"), len= 24, by="15 day")[sample(30, 26)],
  VISIT04  = seq(as.Date("2009-03-30"), len= 24, by="60 day")[sample(30, 26)],
  VISIT05  = seq(as.Date("2010-11-30"), len= 24, by="6 day")[sample(30, 26)],
  VISIT06  = seq(as.Date("2011-01-30"), len= 24, by="6 day")[sample(30, 26)],
  Discharge = seq(as.Date("2012-03-30"), len= 24, by="30 day")[sample(30, 26)],
  Patient  = factor(1:26, labels = LETTERS),
  openCase  = rep(0:1, 100)[sample(100, 26)])

# set today's date for cases that do not have a Discharge date
dfn$Discharge[ is.na(dfn$Discharge) ] <- as.Date("2014-01-30")

mdfn <- melt(dfn, id=c('Patient', 'openCase'), variable_name = "Visit")
names(mdfn)[4] <- 'Year' # rename 

# order data in mdfn by 'Referral' in dfn
mdfn$Patient <- factor(mdfn$Patient,levels = 
  (dfn$Patient[order(dfn$Referral)]),ordered = TRUE)

# subset a dataset to avoid 'Discharge' for cases that are not closed 
mdfn2 <- subset(mdfn,!(Visit=="Discharge" & Year > as.Date("2014-01-01")))

# the plot as it looks now
ggplot(mdfn, aes(Year, Patient)) +
  geom_blank() +
  geom_line(data = mdfn[mdfn$openCase == 0,], colour = "black") +
  geom_line(data = mdfn[mdfn$openCase == 1,], colour = "grey") +
  geom_point(data = mdfn2, aes(colour = Visit), size = 4, shape = 124) + 
  geom_text(data=mdfn2, mapping=aes(x=Year, y=Patient, 
                                    label=substr(Visit, 1, 7), colour=Visit), size=2, 
            vjust=-.4, hjust=-.1, angle = 00)
Eric Fail
  • Missing functions, missing data, not reproducible. – IRTFM May 07 '12 at 05:16
  • I don't know how to do it, but do you really need labels? There's already a legend providing that information. – Ernest A May 07 '12 at 12:51
  • possible duplicate of [Intelligent point label placement in R](http://stackoverflow.com/questions/7611169/intelligent-point-label-placement-in-r) – joran May 07 '12 at 13:57
  • You could adapt the ideas I gave in http://stackoverflow.com/a/8318841/892313 to do this, but that may be more trouble than it is worth. Using color and/or shape to convey that information with the text in a legend is probably cleaner. – Brian Diggs May 07 '12 at 17:27
  • @DWin, I'm sorry. I forgot `require(reshape)` for the `melt`. It's in there now. – Eric Fail May 07 '12 at 17:54
  • @ErnestA, the thing is that I have approximately 15 levels in the original data and when I have the many levels it's really hard to distinguish the colors on the legend. – Eric Fail May 07 '12 at 17:55
  • @joran, I hadn't seen [this question, Intelligent point label placement in R](http://stackoverflow.com/questions/7611169/intelligent-point-label-placement-in-r), or your answer to it. I would still like to keep the labels and have the process automated, though. I also saw the link to the post on [Cross Validated](http://stats.stackexchange.com/questions/16057/how-do-i-avoid-overlapping-labels-in-an-r-plot) on how to [avoid overlapping labels in an R plot](http://stats.stackexchange.com/questions/16057/how-do-i-avoid-overlapping-labels-in-an-r-plot). I'm reading them now … – Eric Fail May 07 '12 at 18:11
  • @BrianDiggs, those are some nice plots, though I hope to be able to 'just' lift some of my labels when the dates are within x weeks, or something like that. I have to read the links posted by joran. – Eric Fail May 07 '12 at 18:17
  • I don't expect many folks asking this type of question to find my answer at that question very satisfying. But that question is a pretty comprehensive list of automated attempts to do this sort of thing, hence my vote to close as a duplicate. – joran May 07 '12 at 18:23
  • @joran, I do see the similarities, but none of them solves the problem posted above. I hope to be able to find a solution and post it here before this question is closed. Thanks, Eric – Eric Fail May 07 '12 at 19:35

1 Answer


You can change the vertical location of the label according to the numeric value of Visit.

The key is:

 y=(as.numeric(Patient)+0.25*as.numeric(Visit)%%3)-0.12

This currently produces:

  • 3 different levels according to the value of Visit (the `%% 3`), which you can increase or decrease;
  • each level separated by a quarter of the distance between y labels (the 0.25);
  • the first level 0.12 below the horizontal line, the second 0.13 above it (0.25 - 0.12), and the third 0.38 above it.
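
To see where each label ends up, the offset part of the expression can be evaluated on its own. The following is a small sketch that uses a stand-in factor with the same eight levels as `Visit` in the sample data, assuming `melt` keeps the columns in their original order; `visit` and `offset` are illustrative names that do not appear in the original code.

```r
# Stand-in for the Visit factor in mdfn (assumed level order: column order of dfn)
lev   <- c("Referral", sprintf("VISIT%02d", 1:6), "Discharge")
visit <- factor(lev, levels = lev)

# In R, %% binds tighter than *, so this is 0.25 * (as.numeric(visit) %% 3) - 0.12
offset <- 0.25 * as.numeric(visit) %% 3 - 0.12

# One row per visit type: its integer code and its vertical shift
print(data.frame(visit, code = as.numeric(visit), offset))
```

Because the factor codes cycle through the remainders 1, 2, 0, consecutive visit types never share a level, but visit types three codes apart (for example VISIT01 and VISIT04) do.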


require(ggplot2)
require(plyr)
require(reshape)
# create sample data
set.seed(666)
dfn <- data.frame(
  Referral  = seq(as.Date("2007-01-15"), len= 26, by="23 day"),
  VISIT01  = seq(as.Date("2008-06-15"), len= 24, by="15 day")[sample(30, 26)],
  VISIT02  = seq(as.Date("2008-12-15"), len= 24, by="15 day")[sample(30, 26)],
  VISIT03  = seq(as.Date("2009-01-01"), len= 24, by="15 day")[sample(30, 26)],
  VISIT04  = seq(as.Date("2009-03-30"), len= 24, by="60 day")[sample(30, 26)],
  VISIT05  = seq(as.Date("2010-11-30"), len= 24, by="6 day")[sample(30, 26)],
  VISIT06  = seq(as.Date("2011-01-30"), len= 24, by="6 day")[sample(30, 26)],
  Discharge = seq(as.Date("2012-03-30"), len= 24, by="30 day")[sample(30, 26)],
  Patient  = factor(1:26, labels = LETTERS),
  openCase  = rep(0:1, 100)[sample(100, 26)])

# set today's date for cases that do not have a Discharge date
dfn$Discharge[ is.na(dfn$Discharge) ] <- as.Date("2014-01-30")

mdfn <- melt(dfn, id=c('Patient', 'openCase'), variable_name = "Visit")
names(mdfn)[4] <- 'Year' # rename 

# order data in mdfn by 'Referral' in dfn
mdfn$Patient <- factor(mdfn$Patient,levels = 
  (dfn$Patient[order(dfn$Referral)]),ordered = TRUE)

# subset a dataset to avoid 'Discharge' for cases that are not closed 
mdfn2 <- subset(mdfn,!(Visit=="Discharge" & Year > as.Date("2014-01-01")))

# the plot as it looks now
ggplot(mdfn, aes(Year, Patient)) +
  geom_blank() +
  geom_line(data = mdfn[mdfn$openCase == 0,], colour = "black") +
  geom_line(data = mdfn[mdfn$openCase == 1,], colour = "grey") +
  geom_point(data = mdfn2, aes(colour = Visit), size = 4, shape = 124) + 
  geom_text(data=mdfn2, mapping=aes(x=Year, y=(as.numeric(Patient)+0.25*as.numeric(Visit)%%3)-0.12, 
                                    label=substr(Visit, 1, 7), colour=Visit), size=2, 
            hjust=-.1, angle = 00)
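
If three levels are not enough, the same expression could be wrapped in a small helper so that the number of levels and the spacing are easy to change. This is only a sketch: `stagger_y` and its arguments `n_levels`, `step` and `base` are hypothetical names, and the defaults are chosen to reproduce the expression used above.

```r
# Hypothetical helper: cycle label heights through n_levels tiers per row.
# patient and visit are factors; the result is a numeric y position.
stagger_y <- function(patient, visit, n_levels = 3, step = 0.25, base = -0.12) {
  as.numeric(patient) + step * (as.numeric(visit) %% n_levels) + base
}

# Small self-contained check with made-up factors
patient <- factor(c("A", "A", "A", "B"), levels = c("A", "B"))
visit   <- factor(c("v1", "v2", "v3", "v1"), levels = c("v1", "v2", "v3"))

print(stagger_y(patient, visit))                           # default: 3 tiers, 0.25 apart
print(stagger_y(patient, visit, n_levels = 4, step = 0.2)) # 4 tighter tiers
```

In the plot, this would presumably be used as `aes(y = stagger_y(Patient, Visit, n_levels = 4, step = 0.2))` inside `geom_text`. With more tiers the text size may need to shrink so that labels do not run into the neighbouring row.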
Etienne Low-Décarie
  • Elegant, impressive. Now I only need to figure out what to do when three time points are near each other and to avoid that the labels are placed on the line. Thanks. – Eric Fail May 10 '12 at 02:19
  • Text is no longer on line. If you need more different height levels, increase the number following %% and decrease the size of the shift (currently 0.25), which may require smaller text. – Etienne Low-Décarie May 10 '12 at 15:00
  • If this is not the actual data and this does not work with the actual data, think of providing the actual data after making it anonymous with: http://stackoverflow.com/a/10458688/742447 – Etienne Low-Décarie May 10 '12 at 15:04
  • @Eric D. Brean, how can I improve my response to your question? – Etienne Low-Décarie May 13 '12 at 06:25
  • Etienne Low-Décarie, thank you for getting back to me. I was trying to add the same function to the x-axis, but I must admit that I haven't fully understood how your add-on works. – Eric Fail May 13 '12 at 22:10
  • @Eric D. Brean, I am confused as to how or what you would want on the X-Axis – Etienne Low-Décarie May 14 '12 at 14:43
  • Dear Etienne, I'm trying to shift the text back and forth on the x-axis the same way the text is shifted up and down on the y-axis. I'll have a look at it later today. I want to thank you for being so diligent and following up on this issue. Thanks, Eric – Eric Fail May 14 '12 at 16:08