
Question

I have a dataset, se.df (data at bottom of question), that I'm visualising as a faceted Gantt chart using ggplot and facet_grid. However, the y-axis labels are not ordered as I've specified in aes:

library(ggplot2)
base <- ggplot(
  se.df,
  aes(
    x = Start.Date, reorder(Action,Start.Date), color = Comms.Type
  ))
base + geom_segment(aes(
  xend = End.Date,ystart = Action, yend = Action
), size = 5) + 
facet_grid(Source ~ .,scale = "free_y",space = "free_y", drop = TRUE)

In this detail image you can see that there are bars that are:

  1. Not shown in Start.Date order
  2. Not ordered by Action. To clarify, bars should be ordered by Start.Date and then alphabetically by Action

enter image description here

How can I order the bars within each factor according to Start.Date and then by Action?

Update

@heathobrien provided a solution that orders the bars by Start.Date, except for an issue arising from duplicate factor values, which my actual data contains.

There are two instances of "Inform colleges" in Action, which result in a misordering in the following code from @heathobrien, highlighted in the image with a dashed red oval:

se.df <-se.df[order(se.df$Start.Date,se.df$Action),]
se.df$Action <- factor(se.df$Action, levels=unique(se.df$Action))
ggplot(se.df, aes(x = Start.Date, color = Comms.Type)) +
  geom_segment(aes(xend = End.Date, y = Action, yend = Action), size = 5) +
  facet_grid(Source ~ .,scale = "free_y",space = "free_y", drop = TRUE)

enter image description here

How can this data.frame be provided to ggplot such that ordering is consistent within each facet_grid?

Further Detail

There are a lot of questions about making Gantt charts and ordering factors; I've made a few decisions based on others' answers:

  1. geom_segment

Many questioners have used geom_linerange but are hampered by the fact that coord_flip cannot be used with non-Cartesian coordinate systems. The workarounds are complicated, and I've avoided them by using geom_segment.

  2. reorder within aes

The almost canonical bar ordering question uses reorder. However, this does not work for my data, even when using transform rather than passing the order to aes directly. I would be very happy to find any solution that works.

Data

se.df <- structure(list(Source = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L), .Label = c("a", "b", "c"), class = c("ordered", 
"factor")), Action = structure(c(21L, 30L, 19L, 27L, 16L, 17L, 
18L, 13L, 12L, 3L, 1L, 8L, 4L, 21L, 20L, 27L, 15L, 17L, 18L, 
14L, 26L, 2L, 8L, 5L, 22L, 26L, 2L, 8L, 5L, 22L, 22L, 11L, 7L, 
24L, 29L, 6L, 23L, 25L, 25L, 10L, 28L, 9L), .Label = c("Add OA \"Act on Acceptance\" to websites", 
"Add RDM liaison presece to divisional and departmental websites", 
"All-staff message from VC and/or Pro-VC (Research)", "Arrange OA Briefing for every department", 
"Arrange RDM Briefing for every department", "Brief Communication Officers Network", 
"Brief Conference of Colleges", "Brief divisional board/commitees", 
"Brief Faculty IT Officers", "Brief Research Committee", "Brief Senior Tutors", 
"Brief/mobilise internal comms officers", "Brief/mobilise ORFN", 
"Brief/mobilise Subject Librarians", "Ceate template slides for colleagues to use in delivering RDM Briefings", 
"Create template slides for colleagues to use in delivering OA Briefings", 
"Create template text & icon for use on websites", "Draft material for use in staff induction", 
"Ensure webpages for ORA & Symplectic Elements  are updated & consistent", 
"Ensure webpages for ORA-Data are updated & consistent", "Finalise key messages and draft campaign text", 
"Inform colleges ", "Inform Heads of Departments and Research Directors", 
"Present at Departmental Administrator's Meeting", "Present at HAF meeting", 
"Present at UAS Conference", "Produce hard copy materials to promote message ", 
"Update Divisional Board", "Update Library Committee (CLIPS)", 
"Update OAO website content for HEFCE/REF"), class = "factor"), 
    Start.Date = structure(c(1435705200, 1435705200, 1438383600, 
    1441062000, 1441062000, 1441062000, 1441062000, 1444518000, 
    1444518000, 1425168000, 1420070400, 1444518000, 1444518000, 
    1441062000, 1441062000, 1441062000, 1441062000, 1441062000, 
    1441062000, 1438383600, 1441062000, 1420070400, 1444518000, 
    1444518000, 1443654000, 1441062000, 1420070400, 1444518000, 
    1444518000, 1443654000, 1441062000, 1444518000, 1449273600, 
    1444518000, 1444518000, 1445036400, 1441062000, 1443740400, 
    1443740400, 1443740400, 1447459200, 1443740400), class = c("POSIXct", 
    "POSIXt"), tzone = ""), End.Date = structure(c(1440975600, 
    1440975600, 1443567600, 1443567600, 1443567600, 1443567600, 
    1443567600, 1449273600, 1449273600, 1430348400, 1446249600, 
    1449273600, 1449273600, 1446249600, 1446249600, 1443567600, 
    1443567600, 1443567600, 1443567600, 1443567600, 1443567600, 
    1443567600, 1449014400, 1449014400, 1451520000, 1443567600, 
    1443567600, 1449014400, 1449014400, 1451520000, 1443567600, 
    1449014400, 1449619200, 1449014400, 1449014400, 1446249600, 
    1446249600, 1449792000, 1449792000, 1449792000, 1447804800, 
    1449792000), class = c("POSIXct", "POSIXt"), tzone = ""), 
    Comms.Type = structure(c(3L, 7L, 7L, 6L, 5L, 7L, 8L, 4L, 
    4L, 2L, 7L, 1L, 1L, 3L, 7L, 6L, 5L, 7L, 8L, 4L, 5L, 7L, 1L, 
    1L, 5L, 5L, 7L, 1L, 1L, 5L, 1L, 1L, 1L, 5L, 3L, 1L, 1L, 5L, 
    5L, 1L, 1L, 1L), .Label = c("Briefing", "Email", "Mixed Media", 
    "Mobilisation", "Presentations", "Printed Materials", "Website", 
    "Workshop"), class = "factor")), .Names = c("Source", "Action", 
"Start.Date", "End.Date", "Comms.Type"), row.names = c(NA, -42L
), class = c("tbl_df", "tbl", "data.frame"))
  • I think if you sort your dataframe by start.date before plotting that will sort you out – heathobrien Sep 01 '15 at 12:24
  • Just to clarify: you want to order Action alphabetically or by Comms.Type? – Felix Sep 01 '15 at 12:35
  • @Felix alphabetically is the goal, sorry for omitting that. heathobrien I'll check when back at my machine – Charlie Joey Hadley Sep 01 '15 at 12:37
  • @heathobrien the following `se.df <-se.df[order(se.df$Start.Date,se.df$Action),]` does not resolve the mis-sorting shown in my image. Removing the `reorder(Action,Start.Date)` call from within `aes` with this reordering DOES reorder `Action` alphabetically but doesn't order by `Start.Date` in the `ggplot` output. – Charlie Joey Hadley Sep 01 '15 at 12:50

3 Answers

3

I think this is what OP is looking for:

enter image description here

I had to create a synthetic taskID to pass the order (by increasing Start.Date, alpha by Action). By the way, if you want to order alphabetically by Action, you'll need to change the order of factors or convert to a char.

# first let's order the DF the way we want it to appear 
#    (higher taskID's first)

# dplyr-free version
se.df$Action <- as.character(se.df$Action)  
se.df <- se.df[order(se.df$Start.Date, se.df$Action), ]
se.df$taskID <- as.factor(nrow(se.df):1)


library(ggplot2)
ggplot(se.df, aes(x = Start.Date, y = taskID, color = Comms.Type)) +
  # show the Action text in place of the synthetic taskID on the y axis
  scale_y_discrete(breaks = se.df$taskID, labels = se.df$Action) +
  geom_segment(aes(xend = End.Date, yend = taskID), size = 5) +
  facet_grid(Source ~ ., scales = "free_y", space = "free_y", drop = TRUE)
C8H10N4O2
  • That's great! You handled the offending "Inform colleges" duplicate, thank you. There is a lot for me to go away and learn (the `%>%` syntax and `dplyr` package are new to me), but that is very much on me to do :) I don't understand why my question got a downvote, though. I felt my example was minimal and repeatable and I collated resources enough for others experiencing similar problems. Anyway, thank you for your time. – Charlie Joey Hadley Sep 01 '15 at 14:21
  • @MartinJohnHadley changed it to base R only. Note that `dplyr::arrange` uses C++ to alphabetize, so with `order` the 'Inform colleges' now appears ahead of 'Inform Heads...'. dplyr wasn't a huge help in this case but has saved me on many occasions. – C8H10N4O2 Sep 01 '15 at 14:29
2

Once you've sorted the dataframe in the order you want, you should be able to use that as the levels for your factor:

se.df <-se.df[order(se.df$Start.Date,se.df$Action),]
se.df$Action <- factor(se.df$Action, levels=unique(se.df$Action))
ggplot(se.df, aes(x = Start.Date, color = Comms.Type)) +
  geom_segment(aes(xend = End.Date, y = Action, yend = Action), size = 5) +
  facet_grid(Source ~ .,scale = "free_y",space = "free_y", drop = TRUE)

enter image description here

heathobrien
  • Thanks for the answer, unfortunately there are still two issues in `Source` "b": 1) the "ensure webpages" item is displayed before the "finalise key messages" bar, despite a later start date; 2) many events start on 2015-09-01, but weirdly the "Inform colleges" bar appears on top of "produce hard copy" despite being alphabetically before it. Sorry my data has these long entries, makes it difficult to talk about :( – Charlie Joey Hadley Sep 01 '15 at 13:55
  • ... I wrote that before evaluating the updated answer on my machine, I do now have the ordering that I need - thanks! Do you want me to upload another image for you, save you the trouble? – Charlie Joey Hadley Sep 01 '15 at 13:58
  • These problems are because you have the same action name in there several times with different start times. I'm not sure what the solution to that is (other than renaming them) because you can't have two factor levels with the same name – heathobrien Sep 01 '15 at 13:59
  • sure. that would be most helpful – heathobrien Sep 01 '15 at 14:00
  • Thanks for your answer - it will be useful in cases where I can guarantee unique entries. @C8H10N4O2 answer was able to overcome the duplicate issue, and is forcing me to learn this %>% insanity. – Charlie Joey Hadley Sep 01 '15 at 14:19
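
The duplicate-label problem discussed in these comments can be reproduced on a small scale. The following is a minimal sketch in base R, assuming a hypothetical four-row `toy` data frame (not the asker's `se.df`): building the levels from `unique()` gives one level per distinct label, so two rows that share a label also share a position on the axis, whereas a per-row ID keeps every row distinct.

```r
# Hypothetical toy data (not se.df): the label "Inform" occurs twice
toy <- data.frame(
  Action = c("Inform", "Brief", "Inform", "Update"),
  Start  = as.Date(c("2015-09-01", "2015-07-01", "2015-10-01", "2015-08-01")),
  stringsAsFactors = FALSE
)

# Sort by start date, then alphabetically by action
toy <- toy[order(toy$Start, toy$Action), ]

# Levels taken from unique labels: the duplicate collapses into one level,
# positioned where the label first appears in the sorted data
by_label <- factor(toy$Action, levels = unique(toy$Action))
nlevels(by_label)

# One synthetic ID per row: every bar keeps its own position
toy$taskID <- factor(seq_len(nrow(toy)))
nlevels(toy$taskID)
```

Because factor levels are shared across all facets, the position of a repeated label in every panel is fixed by its first occurrence in the sorted data, which is consistent with the misordering described above.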
2

This question was asked in 2015, before the advent of the tidyverse and the excellent forcats library.

Here's a tidyverse solution to the problem:

Use arrange to order the data by Start.Date (descending, as in the desc() call) and then by Action, and create a task_id using row_number():

library("tidyverse")
se.df <- se.df %>%
  arrange(desc(Start.Date), Action) %>%
  mutate(task_id = row_number())

Use fct_reorder to convert Action to a factor ordered by task_id.

se.df <- se.df %>%
  mutate(
    Action = fct_reorder(Action, task_id),
    Action = fct_rev(Action)
  )

Now we can chart this data without having to replace axis labels:

se.df %>%
  ggplot(aes(x = Start.Date, y = Action, color = Comms.Type)) +
  geom_segment(aes(xend = End.Date, y = Action, yend = Action), size = 5) +
  facet_grid(Source ~ ., scale = "free_y", space = "free_y", drop = TRUE)
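
One caveat worth checking against your own data: `fct_reorder` still produces one level per distinct label, and by default it places each level using the median of the ordering variable. The sketch below uses hypothetical toy vectors (not `se.df`) to show what that means when a label such as "Inform" appears on more than one row; it assumes the forcats package is installed.

```r
library(forcats)

# Hypothetical toy data: "Inform" appears on rows 1 and 4
action  <- c("Inform", "Brief", "Update", "Inform")
task_id <- 1:4

# The two "Inform" rows share a single level, placed by the median of
# their task_id values (2.5), i.e. between "Brief" (2) and "Update" (3)
f <- fct_reorder(action, task_id)
levels(f)
nlevels(f)
```

If the data contain repeated Action values with different start dates, mapping y to a per-row ID and relabelling the axis, as in the taskID answer above, may therefore still be needed.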