Dodge geom_errorbar by group when there are missing values

Question

I am trying to plot different points, some of which are observations (therefore with no error bars), others are predictions (with error bars). I used position_dodge to place my points, but because there are missing values for error bars, I cannot find a way to match the error bars with their respective points.

Below is a simple reproducible example inspired by my dataset.

a <- data.frame(taxon = "plants", type = c(rep("observation", 3), "prediction"), period = c("1970-2017", "2000-2009", "2010-2017", "2017"), value = 1:4, lwr = c(NA, NA, NA, 3.5), upr = c(NA, NA, NA, 4.5))

a
#>    taxon        type    period value lwr upr
#> 1 plants observation 1970-2017     1  NA  NA
#> 2 plants observation 2000-2009     2  NA  NA
#> 3 plants observation 2010-2017     3  NA  NA
#> 4 plants  prediction      2017     4 3.5 4.5

This is the code I used for ggplot:

library(ggplot2)

ggplot(a) +
  geom_point(aes(x = taxon,
                 shape = type,
                 y = value,
                 col = period),
             position = position_dodge(width = .5)) +
  geom_errorbar(aes(x = taxon,
                    ymin = lwr, ymax = upr),
                position = position_dodge(width = .5))

As you can see, the error bar is centered, most likely because the missing values in lwr and upr have been omitted, whereas it should be on the top right point. All my attempts to fix this (e.g., different position_dodge settings, or specifying the preserve argument) have been unsuccessful so far, and I have not been able to find help on the internet.

score 2 · Accepted Answer · edited Jan 30 '20 at 15:38

This is probably not the most elegant solution but you could use geom_pointrange instead and make the upr and lwr values the same as your value column so they get plotted without error bars.

e.g.

    library(ggplot2)

    # Where there is no interval, set both bounds to the value so no bar is visible
    a$lwr <- ifelse(is.na(a$lwr), a$value, a$lwr)
    a$upr <- ifelse(is.na(a$upr), a$value, a$upr)

    ggplot(a) +
      geom_pointrange(aes(x = taxon, y = value, ymin = lwr, ymax = upr, shape = type, col = period),
                      position = position_dodge(width = 0.5)) +
      theme_bw()

This gives the following graph, which sounds like what you want: [image: Pointrange Example]
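To make the effect of the two ifelse() lines concrete, here is a minimal base-R sketch using the example data from the question. It only checks the data side of the workaround: rows without an interval end up with lwr and upr equal to value, so geom_pointrange has a zero-length range to draw for them.

```r
# Example data from the question
a <- data.frame(
  taxon  = "plants",
  type   = c(rep("observation", 3), "prediction"),
  period = c("1970-2017", "2000-2009", "2010-2017", "2017"),
  value  = 1:4,
  lwr    = c(NA, NA, NA, 3.5),
  upr    = c(NA, NA, NA, 4.5)
)

# Collapse missing bounds onto the point itself
a$lwr <- ifelse(is.na(a$lwr), a$value, a$lwr)
a$upr <- ifelse(is.na(a$upr), a$value, a$upr)

a[, c("type", "value", "lwr", "upr")]
#>          type value lwr upr
#> 1 observation     1 1.0 1.0
#> 2 observation     2 2.0 2.0
#> 3 observation     3 3.0 3.0
#> 4  prediction     4 3.5 4.5
```

Because no NA is left in lwr or upr, no rows are dropped from the layer, and every point keeps its own dodge position.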

edited Jan 30 '20 at 15:38

tjebo

answered Jan 30 '20 at 10:11

A.Elsy

This is a great workaround. The only issue is that the legends are altered, but that is a minor problem which can probably be corrected with override.aes. I will wait to see if there is a more "official" method before I mark your answer as the solution. – Boris Leroy Jan 30 '20 at 10:25
Cheers, did you manage to sort out the legend issue? If not perhaps you could use grid.arrange to extract the correct legend from your original plot and combine it with this plot? See this question: https://stackoverflow.com/questions/13649473/add-a-common-legend-for-combined-ggplots – A.Elsy Jan 31 '20 at 13:54
Yes, I ended up using your solution in the end! ;) Worked very fine for me. – Boris Leroy Jan 31 '20 at 16:52

tjebo · Answer 2 · 2020-01-30T15:41:14.587

2

I'd probably not rely on dodging by group; just change your x variable to interaction(taxon, period). Then you can remove the dodge, and it will look like this:

a <- data.frame(taxon = "plants", type = c(rep("observation", 3), "prediction"), period = c("1970-2017", "2000-2009", "2010-2017", "2017"), value = 1:4, lwr = c(NA, NA, NA, 3.5), upr = c(NA, NA, NA, 4.5))

library(ggplot2)
ggplot(a) +
  geom_point(aes(x = interaction(taxon, period), shape = type, y = value, col = period)) + 
  geom_errorbar(aes(x = interaction(taxon, period), ymin = lwr, ymax = upr))
#> Warning: Removed 3 rows containing missing values (geom_errorbar).

Created on 2020-01-30 by the reprex package (v0.3.0)
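As a small illustration of why this removes the need for dodging, the base-R sketch below shows what interaction() returns for the example data: one distinct x value per taxon/period combination, so each point and its error bar share the same axis position. The variable name combined is only used for this illustration.

```r
# Example columns from the question
a <- data.frame(
  taxon  = "plants",
  period = c("1970-2017", "2000-2009", "2010-2017", "2017")
)

# interaction() pastes the factor levels together with "." by default
combined <- interaction(a$taxon, a$period)

as.character(combined)  # one label per row, e.g. "plants.1970-2017"
nlevels(combined)       # 4: one x position per taxon/period combination
```

With many taxa the number of x positions grows quickly, since every taxon/period pair present in the data gets its own position on the axis.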

edit as per comment

If you've got more than one taxon, a very neat way would be to separate them by facet.

a <- data.frame(taxon = "plants", type = c(rep("observation", 3), "prediction"), period = c("1970-2017", "2000-2009", "2010-2017", "2017"), value = 1:4, lwr = c(NA, NA, NA, 3.5), upr = c(NA, NA, NA, 4.5))
b <- data.frame(taxon = "plants_b", type = c(rep("observation", 3), "prediction"), period = c("1970-2017", "2000-2009", "2010-2017", "2017"), value = 1:4, lwr = c(NA, NA, NA, 3.5), upr = c(NA, NA, NA, 4.5))

library(ggplot2)
ggplot(rbind(a,b)) +
  geom_point(aes(x = interaction(taxon, period), shape = type, y = value, col = period)) + 
  geom_errorbar(aes(x = interaction(taxon, period), ymin = lwr, ymax = upr)) +
  facet_grid(~taxon, scales = 'free_x') +
  theme(axis.text.x = element_blank())
#> Warning: Removed 6 rows containing missing values (geom_errorbar).

Created on 2020-01-30 by the reprex package (v0.3.0)

I have included free x scales and removed the x-axis labels because they don't add any information that is not already in the facet title or the color legend.
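For readers who would rather keep position_dodge, here is a further sketch that does not come from either answer. It assumes that the centered bar comes from grouping: position_dodge dodges by group, and the error-bar layer in the question has no colour or shape mapping that would split it into groups. Giving both layers the same explicit group aesthetic, here interaction(type, period), should place each bar on its point. The error-bar width of 0.1 is an arbitrary choice for this illustration.

```r
library(ggplot2)

# Example data from the question
a <- data.frame(
  taxon  = "plants",
  type   = c(rep("observation", 3), "prediction"),
  period = c("1970-2017", "2000-2009", "2010-2017", "2017"),
  value  = 1:4,
  lwr    = c(NA, NA, NA, 3.5),
  upr    = c(NA, NA, NA, 4.5)
)

# Assumption: one dodge slot per type/period combination, shared by both
# layers through the plot-level group aesthetic
p <- ggplot(a, aes(x = taxon, y = value, group = interaction(type, period))) +
  geom_point(aes(shape = type, col = period),
             position = position_dodge(width = 0.5)) +
  geom_errorbar(aes(ymin = lwr, ymax = upr),
                width = 0.1,
                position = position_dodge(width = 0.5))

p
```

The rows with missing lwr and upr still trigger the "Removed rows" warning when the plot is drawn, but they keep their dodge slots, so the single remaining bar is not re-centered.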

edited Jan 30 '20 at 15:41

answered Jan 30 '20 at 11:52

tjebo

Yes, I thought about that too, but I have an issue with this workaround because the actual dataset is large, with about 16 taxons being compared. Hence, the interaction makes the X axis difficult to set up nicely. – Boris Leroy Jan 30 '20 at 13:28
@BorisLeroy you could separate those different taxons by facet! This makes them easy to differentiate and keeps the graph neat, see the updated answer – tjebo Jan 30 '20 at 15:30
@BorisLeroy please definitely consider accepting A.Elsy's answer - they deserve some rep :) – tjebo Jan 30 '20 at 15:47
Thanks for your additional comments. Yes, you are right that the graph is really nice with facets, and your answer is correct. However, the taxons are also nested within larger taxons on the X axis (e.g., plants = [taxon a, taxon b]; vertebrates = [taxon c, taxon d]), and I already had plans to make facets for the larger taxons. Nevertheless, you were on point! It's just that I provided an overly simplified example to focus on the position_dodge problem. I am marking A.Elsy's answer as the solution because I used it, but I am certain your answer will be useful to others as well! – Boris Leroy Jan 31 '20 at 09:29
@borisleroy - from what you said I am not sure if you wanted to accept my answer, but you apparently have marked mine ... it's really alright for me not to have the accepted answer, and I thank you for your kind words on my answer :) – tjebo Jan 31 '20 at 16:29
You are right I was too quick! I thought I had marked A.Elsy as the solution but I did not. I sorted it out! ;) – Boris Leroy Jan 31 '20 at 16:51
