
I am summarising the evidence cited by different trials in their reference sections. I want to display the earliest and most recent cited papers on a chart, along with the year of publication of the trial itself. I have tried solutions using ggplot, the base plot function, and googleVis, but with no luck.

What I want is something like a Gantt chart, with the names of the trials on the y-axis and the years (yyyy) on the x-axis. I've run into trouble because most Gantt chart code out there works on Dates and can't handle the three elements I need on the chart:

  • Earliest reference
  • Latest reference
  • Date of publication

[Image: poorly drawn Post-it sketch of what I'm trying to achieve]

Update: This is close to what I want, and the code works very well, thank you. I'm glad you did it in ggplot too, as I'm used to that package.

I also need to add a third class (pubdate) onto the chart, so the df is

df <- structure(list(task = structure(1:3, .Label = c("Trial1", "Trial2", "Trial3"), 
                                  class = "factor"), start_year = c(1980, 2003, 2000),
                 end_year = c(2006, 2013, 2010), pub_date = c(2011, 2015, 2013)), 
            class = "data.frame",
            row.names = c(NA, 3L))

I would like pub_date to be separated from the start_year<->end_year line on the graph.

Djo
    In order to receive the best response, I would recommend you provide a reproducible example and what you've tried so far. See [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for tips on how to make such an example. – Roman Luštrik Jan 29 '19 at 11:42

1 Answer


This is a reproducible example of what you want. First, you need the start and end year of each task, with the different tasks stored as a factor in your data frame, as follows.

  df <- structure(list(task = structure(1:3, .Label = c("Trial1", "Trial2", 
    "Trial3"), class = "factor"), start_year = c(1980, 2003, 2000
    ), end_year = c(2006, 2013, 2010), pub_date = c(2011, 2015, 2013
    )), class = "data.frame", row.names = c(NA, 3L))

An important step is to tidy your dates, for example with the gather function from the tidyr package. This puts the start and end years in the same column, which makes it easier to plot a line per task.

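   library(tidyverse)
   df %>% 
      # stack start_year and end_year into one "year" column (two rows per task)
      gather(key = "start_end", value = "year", -task, -pub_date) %>%
      ggplot(aes(x = year, y = task, color = task)) +
      # line from earliest to latest reference, with a point at each end
      geom_line(size = 2) + 
      geom_point(size = 3) + 
      # publication year as a separate cross-shaped marker
      geom_point(aes(x = pub_date), shape = 3, size = 3) +
      scale_x_continuous(breaks = seq(1980, 2016, 6))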
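
If you would rather not reshape the data, a possible alternative is geom_segment, which draws the reference span straight from the wide columns and leaves pub_date as its own layer. This is only a sketch under a few assumptions: the data frame is rebuilt here with data.frame() so the block runs on its own, the black "x" marker and axis labels are illustrative choices, and the linewidth argument assumes ggplot2 3.4 or later (on older versions use size instead).

```r
library(ggplot2)

# Same values as the df in the question, rebuilt so this block is self-contained
df <- data.frame(
  task       = factor(c("Trial1", "Trial2", "Trial3")),
  start_year = c(1980, 2003, 2000),
  end_year   = c(2006, 2013, 2010),
  pub_date   = c(2011, 2015, 2013)
)

p <- ggplot(df, aes(y = task, colour = task)) +
  # one horizontal segment per trial: earliest to latest cited reference
  geom_segment(aes(x = start_year, xend = end_year, yend = task),
               linewidth = 2) +
  # end points of the reference span
  geom_point(aes(x = start_year), size = 3) +
  geom_point(aes(x = end_year), size = 3) +
  # publication year, drawn as a separate black "x" so it stands apart
  geom_point(aes(x = pub_date), shape = 4, size = 4, colour = "black") +
  scale_x_continuous(breaks = seq(1980, 2016, 6)) +
  labs(x = "Year", y = NULL, colour = "Trial")

p
```

Because each row already holds everything for one trial, the publication marker is drawn once per trial rather than once per reshaped row.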


Johan Rosa
  • Thanks for this code, it was helpful, but unfortunately I couldn't get a third element (see updated question) added to the chart. I am new to R so this might be my inexperience. – Djo Feb 05 '19 at 17:33
  • I have now added a new geom_point layer showing pub_date for each trial. If you need anything else, let me know. – Johan Rosa Feb 06 '19 at 13:35