
I am trying to draw an area plot that displays the number of visitors for different events over time. My issue is that when one event ends at exactly the same point in time at which another one starts, the graph gets messed up.

The following code produces a plot where the areas show the true number of visitors, but it leaves blank spaces.

library(ggplot2)

df1 <- data.frame(time = c(1,2,3,3,4,5),
                  visitors = rep(3,6),
                  type = c(rep("A",3),
                           rep("B",3)))

ggplot(data = df1, aes(x = time, y = visitors, fill = type)) +
  geom_area(stat = "identity")

[Image: plot with blank polygons]

I know from "R: stacked geom_area plot displays blank polygons" that these blank areas can be filled by adding rows that explicitly specify the zeros:

df2 <- data.frame(time = rep(1:5, 2),
                  visitors = c(3,3,3,0,0,
                               0,0,3,3,3),
                  type = c(rep("A",5),
                           rep("B",5)))

ggplot(data = df2, aes(x = time, y = visitors, fill = type)) +
  geom_area(position = "stack")

[Image: plot after explicitly adding zeros to data]

Unfortunately, this misrepresents the data by showing visitors for event B before time 3 and for event A after time 3. I know from "geom_area produces blank areas between layers" that this can be partly addressed by using position = "dodge". However, the problem is not fully solved, as visitors are still displayed for event B before time 3:

[Image: plot after using position = "dodge"]
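For comparison, here is a minimal sketch of the `position = "identity"` variant that teunbrand suggests in the comments, using the same `df1` as above. With stacking turned off, each event should be drawn as its own area over its own time range, so A and B should meet at a vertical border at time 3. The `requireNamespace()` guard is only there so the snippet still runs where ggplot2 is not installed.

```r
# Same data as df1 in the question: A spans times 1-3, B spans times 3-5.
df1 <- data.frame(time = c(1, 2, 3, 3, 4, 5),
                  visitors = rep(3, 6),
                  type = rep(c("A", "B"), each = 3))

# Draw the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # position = "identity" disables stacking, so each group is drawn
  # from 0 up to its own visitor count over its own x-range.
  p <- ggplot(data = df1, aes(x = time, y = visitors, fill = type)) +
    geom_area(stat = "identity", position = "identity")
  print(p)
}
```

Note that without stacking, events that overlap in time would be drawn on top of each other rather than added up, so this only fits data where events follow one another.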

Edit 1: The plot at the end should look like this:

[Image: plot that looks like what I'm after]

Edit 2: I just realized that the code I used to produce the image above is exactly what I'm looking for.

df3 <- data.frame(start = c(1,6,3),
                  end   = c(6,9,7),
                  visitors = c(3,4,2),
                  type = c("A", "B", "C"))

# repeats each event row once per unit of its duration
df3 <- df3[rep(rownames(df3), df3$end-df3$start),]

# creates a vectorized seq() function
seq.vector <- Vectorize(seq.default, vectorize.args = c("from", "to"))

# creates a data point between every two time points of each event
df3$time <- unlist(seq.vector(unique(df3)$start + 0.5, unique(df3)$end - 0.5, 1))

# bars of width 1 centred on the midpoints tile each event's time range
ggplot(data = df3, aes(x = time, y = visitors, fill = type)) +
  geom_bar(stat = "identity", width = 1)

So thanks to teunbrand for asking the right question.
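The expansion step in Edit 2 can probably also be written without `Vectorize`. The following is a minimal sketch, assuming integer start and end times with `end > start`. The names `events`, `n` and `long` are introduced here for illustration only. It relies on base R's `sequence()`, which returns `1..n[1], 1..n[2], ...` concatenated.

```r
# Same events as df3 in Edit 2.
events <- data.frame(start = c(1, 6, 3),
                     end = c(6, 9, 7),
                     visitors = c(3, 4, 2),
                     type = c("A", "B", "C"))

# Number of unit-width bars needed per event (assumes integer times).
n <- events$end - events$start

# Repeat each event row once per unit of duration.
long <- events[rep(seq_len(nrow(events)), n), ]

# Midpoint of each unit interval: start + 0.5, start + 1.5, ..., end - 0.5.
long$time <- long$start + sequence(n) - 0.5

# Draw the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data = long, aes(x = time, y = visitors, fill = type)) +
    geom_bar(stat = "identity", width = 1)
  print(p)
}
```

As in the original code, bars that share a midpoint are stacked by `geom_bar`, so overlapping events are added on top of each other.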

dropout
  • What is your ideal output? – yusuzech Oct 08 '19 at 21:22
  • Sorry, I forgot to specify: I'm looking for a plot that displays the two areas as rectangles next to each other, i.e. the same as the third plot using `position_dodge`, but with the border between the two areas being a straight vertical line at 3. – dropout Oct 08 '19 at 22:36
  • Are you looking for `position = "identity"` instead of the default? – teunbrand Oct 09 '19 at 05:56
