grouping variables on y-axis using geom_segment in ggplot2

Question

I am drawing a segmented line plot with `geom_segment` and want to group the segments on the y-axis by a factor (in this case Patient ID). How can I change the spacing of the ticks on the y-axis so that all of a patient's segments are grouped under his/her ID, with only one label for each unique ID?

My plotting code and a sample of my data are below.

ggplot(data) +
  geom_segment(aes(x=age1, xend=age2, 
                   y=PatientID, yend=PatientID, colour=mortality)) + 
  scale_colour_manual(values=c("green", "red", "black"))

Data:

PatientID   age1     age2    mortality
11313          0        30        low
11313          31       50        low 
11313          51       65        med  
11313          0        10        med
11313          0        50        hi 
131NY          0        30        med
143CA          24       27        hi
165099         23       45        med
165099         46       55        hi 
165099         40       55        med

Welcome to Stack Overflow. Rather than pasting what your data looks like, please edit your question to include the data using the output of `dput()` - see [this very useful question](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for details and guidance. Something like `dput(head(data, 20))` should do the trick. Oh, and it's better not to call your data `data` as that is the name of a built-in function in R. — SlowLearner, Aug 26 '13 at 18:47
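
To illustrate the suggestion in the comment above, here is a minimal sketch of how the sample rows could be shared reproducibly. The values are copied from the table in the question; the name `patients` is a hypothetical choice that avoids masking the built-in `data()` function.

```r
# Hypothetical data frame name; rows copied from the question's table
patients <- data.frame(
  PatientID = c("11313", "11313", "11313", "11313", "11313",
                "131NY", "143CA", "165099", "165099", "165099"),
  age1      = c(0, 31, 51, 0, 0, 0, 24, 23, 46, 40),
  age2      = c(30, 50, 65, 10, 50, 30, 27, 45, 55, 55),
  mortality = c("low", "low", "med", "med", "hi",
                "med", "hi", "med", "hi", "med"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object exactly,
# so others can paste it straight into their session
dput(head(patients, 20))
```

Pasting the `dput()` output into the question lets answerers work with exactly the same column types and values.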

score 1 · Answer 1 · answered Aug 26 '13 at 19:10

I used the sample data provided, and the output already looks like the result described in the question: patients are grouped by ID and each has only one label (see 11313 below). Am I missing something?

[Screenshot of the resulting plot]

library(ggplot2)

mytext <- "PatientID,age1,age2,mortality
11313,0,30,low
11313,31,50,low
11313,51,65,med
131NY,0,30,med
143CA,24,27,hi
165099,23,45,med
165099,46,55,hi"

dat <- read.table(textConnection(mytext), sep = ",",
                  check.names = FALSE,
                  strip.white = TRUE,
                  header = TRUE)

ggplot(dat) +
    geom_segment(aes(x = age1, xend = age2,
                     y = PatientID, yend = PatientID, colour = mortality)) +
    scale_colour_manual(values = c("green", "red", "black"))

SlowLearner

My fault: the data should also have included overlapping lines, e.g. PatientID=11313; age1=0; age2=10; mortality=low. These are plotted on top of one another when they share an ID, and if I alter the ID to distinguish the lines (e.g. 11313 and 11313a) they are plotted a fixed distance apart along the y-axis. – user2719033 Aug 26 '13 at 19:43
Well, this is precisely why paying attention to providing a proper data subset using `dput` or similar methods is important. Please edit your question to reflect the data you want us to actually work on and we will have a chance of helping you. – SlowLearner Aug 26 '13 at 20:22
Data is amended to reflect the "overlapping" line segments. – user2719033 Aug 26 '13 at 22:08
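
For the amended data with overlapping segments, one possible approach is sketched below. It is not taken from the answer above and rests on assumptions: each segment gets its own row ("lane") within its patient, and the plot is faceted by patient so that every ID appears once as a strip label. The `lane` column is a hypothetical helper, and `strip.text.y.left` assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Sample rows from the question, including the overlapping segments
dat <- data.frame(
  PatientID = c("11313", "11313", "11313", "11313", "11313",
                "131NY", "143CA", "165099", "165099", "165099"),
  age1      = c(0, 31, 51, 0, 0, 0, 24, 23, 46, 40),
  age2      = c(30, 50, 65, 10, 50, 30, 27, 45, 55, 55),
  mortality = c("low", "low", "med", "med", "hi",
                "med", "hi", "med", "hi", "med"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: number the segments 1, 2, 3, ... within each
# patient so that no two segments of one patient share a y position
dat$lane <- ave(seq_along(dat$PatientID), dat$PatientID, FUN = seq_along)

p <- ggplot(dat) +
  geom_segment(aes(x = age1, xend = age2,
                   y = lane, yend = lane, colour = mortality)) +
  # One panel per patient; panel height follows the number of lanes,
  # and switch = "y" moves the patient label to the left-hand side
  facet_grid(PatientID ~ ., scales = "free_y", space = "free_y",
             switch = "y") +
  # Lane numbers carry no meaning, so suppress their ticks and labels
  scale_y_continuous(breaks = NULL, expand = c(0, 0.5)) +
  # Unnamed colours are matched to the levels in alphabetical order
  # (hi, low, med), exactly as in the original plotting code
  scale_colour_manual(values = c("green", "red", "black")) +
  labs(x = "Age", y = "PatientID") +
  theme(strip.text.y.left = element_text(angle = 0))

print(p)
```

Giving every segment its own lane is the simplest choice; segments that do not overlap (such as 0 to 30 and 31 to 50 for patient 11313) could share a lane, at the cost of a more involved lane assignment.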
