I have a data set of species observed along a transect line, and I have plotted it as a bar graph with ggplot2.

The problem is that I want to maintain the same bar width and spacing across all bars, even for species that were not observed at every treatment level.

I have tried using geom_col(position = position_dodge(preserve = "single")), but it does not seem to work: it makes all the bars very slim and skews everything. Any other suggestions?

Here is my plotting code:

library(ggplot2)

bar <- ggplot(data, aes(x = Species, fill = Treatment, y = pct)) +
  geom_col(position = position_dodge(0.8), width = 0.8, color = "black") +
  theme_bw() +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 0.35)) +
  ylab("Proportion of transects") +
  scale_fill_grey(start = 0, end = 0.9)
bar

And here is my data set:

data <- structure(list(Species = structure(c(3L, 13L, 5L, 3L, 5L, 13L, 
3L, 9L, 13L, 3L, 13L, 9L, 5L, 3L, 5L, 13L, 3L, 3L, 9L, 13L, 14L, 
3L, 13L, 3L, 9L, 3L, 13L, 9L, 14L, 3L, 13L, 3L, 3L, 9L, 13L, 
3L, 9L, 13L, 3L, 9L, 13L, 3L, 9L, 13L, 3L, 3L, 9L, 13L, 3L, 13L, 
14L, 3L, 13L, 3L, 9L, 3L, 13L, 14L, 9L, 3L, 13L, 3L, 13L, 3L, 
13L, 9L, 3L, 3L, 9L, 3L, 13L, 9L, 3L, 13L, 3L, 9L, 13L, 13L, 
9L, 3L, 3L, 13L, 9L, 3L, 3L, 13L, 9L, 13L, 3L, 3L, 13L, 9L, 2L, 
2L, 4L, 2L, 4L, 2L, 4L, 2L, 1L, 2L, 6L, 2L, 4L, 2L, 4L, 2L, 2L, 
10L, 4L, 2L, 1L, 2L, 2L, 4L, 2L, 2L, 4L, 2L, 6L, 2L, 2L, 2L, 
2L, 2L, 4L, 2L, 2L, 2L, 4L, 2L, 4L, 2L, 4L, 2L, 4L, 2L, 2L, 4L, 
2L, 1L, 2L, 2L, 2L), levels = c("ACOD", "CUNN", "DOL", "FLOU", 
"LOB", "LUMP", "POUT", "PRIA", "RCRAB", "SCUL", "SKATE", "SRAV", 
"STAR", "URCH"), class = "factor"), Treatment = structure(c(2L, 
2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 1L, 
2L, 2L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 2L, 
2L, 2L, 3L, 3L, 3L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 2L, 
2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 1L, 1L, 2L, 2L, 
2L, 3L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 
3L, 3L, 1L, 2L, 2L, 2L, 3L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 1L, 1L, 
2L, 2L, 3L, 3L, 1L, 2L, 2L, 2L, 3L, 3L, 1L, 2L, 2L, 2L, 3L, 3L, 
1L, 2L, 2L, 3L, 2L, 2L, 3L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 3L, 1L, 
2L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 3L, 3L, 1L, 1L, 2L, 3L, 1L
), levels = c("Control", "Reef", "Reef Adjacent"), class = "factor"), 
    Count = c(36L, 35L, 5L, 36L, 4L, 22L, 36L, 24L, 21L, 36L, 
    35L, 24L, 5L, 36L, 4L, 22L, 36L, 36L, 24L, 35L, 18L, 36L, 
    22L, 36L, 24L, 36L, 35L, 24L, 18L, 36L, 22L, 36L, 36L, 24L, 
    35L, 36L, 21L, 22L, 36L, 24L, 35L, 36L, 21L, 22L, 36L, 36L, 
    24L, 21L, 36L, 35L, 18L, 36L, 22L, 36L, 24L, 36L, 35L, 18L, 
    24L, 36L, 22L, 36L, 21L, 36L, 35L, 24L, 36L, 36L, 24L, 36L, 
    35L, 24L, 36L, 22L, 36L, 24L, 21L, 35L, 24L, 36L, 36L, 22L, 
    21L, 36L, 36L, 35L, 24L, 22L, 36L, 36L, 21L, 24L, 35L, 34L, 
    14L, 22L, 4L, 35L, 16L, 34L, 5L, 22L, 1L, 35L, 16L, 34L, 
    14L, 22L, 35L, 2L, 16L, 34L, 5L, 22L, 35L, 16L, 34L, 35L, 
    16L, 34L, 2L, 22L, 35L, 34L, 22L, 35L, 14L, 34L, 22L, 35L, 
    16L, 34L, 14L, 35L, 16L, 34L, 14L, 35L, 34L, 14L, 22L, 1L, 
    35L, 34L, 22L), pct = c(0.333333333333333, 0.324074074074074, 
    0.0462962962962963, 0.333333333333333, 0.037037037037037, 
    0.203703703703704, 0.333333333333333, 0.222222222222222, 
    0.194444444444444, 0.333333333333333, 0.324074074074074, 
    0.222222222222222, 0.0462962962962963, 0.333333333333333, 
    0.037037037037037, 0.203703703703704, 0.333333333333333, 
    0.333333333333333, 0.222222222222222, 0.324074074074074, 
    0.166666666666667, 0.333333333333333, 0.203703703703704, 
    0.333333333333333, 0.222222222222222, 0.333333333333333, 
    0.324074074074074, 0.222222222222222, 0.166666666666667, 
    0.333333333333333, 0.203703703703704, 0.333333333333333, 
    0.333333333333333, 0.222222222222222, 0.324074074074074, 
    0.333333333333333, 0.194444444444444, 0.203703703703704, 
    0.333333333333333, 0.222222222222222, 0.324074074074074, 
    0.333333333333333, 0.194444444444444, 0.203703703703704, 
    0.333333333333333, 0.333333333333333, 0.222222222222222, 
    0.194444444444444, 0.333333333333333, 0.324074074074074, 
    0.166666666666667, 0.333333333333333, 0.203703703703704, 
    0.333333333333333, 0.222222222222222, 0.333333333333333, 
    0.324074074074074, 0.166666666666667, 0.222222222222222, 
    0.333333333333333, 0.203703703703704, 0.333333333333333, 
    0.194444444444444, 0.333333333333333, 0.324074074074074, 
    0.222222222222222, 0.333333333333333, 0.333333333333333, 
    0.222222222222222, 0.333333333333333, 0.324074074074074, 
    0.222222222222222, 0.333333333333333, 0.203703703703704, 
    0.333333333333333, 0.222222222222222, 0.194444444444444, 
    0.324074074074074, 0.222222222222222, 0.333333333333333, 
    0.333333333333333, 0.203703703703704, 0.194444444444444, 
    0.333333333333333, 0.333333333333333, 0.324074074074074, 
    0.222222222222222, 0.203703703703704, 0.333333333333333, 
    0.333333333333333, 0.194444444444444, 0.222222222222222, 
    0.324074074074074, 0.314814814814815, 0.12962962962963, 0.203703703703704, 
    0.037037037037037, 0.324074074074074, 0.148148148148148, 
    0.314814814814815, 0.0462962962962963, 0.203703703703704, 
    0.00925925925925926, 0.324074074074074, 0.148148148148148, 
    0.314814814814815, 0.12962962962963, 0.203703703703704, 0.324074074074074, 
    0.0185185185185185, 0.148148148148148, 0.314814814814815, 
    0.0462962962962963, 0.203703703703704, 0.324074074074074, 
    0.148148148148148, 0.314814814814815, 0.324074074074074, 
    0.148148148148148, 0.314814814814815, 0.0185185185185185, 
    0.203703703703704, 0.324074074074074, 0.314814814814815, 
    0.203703703703704, 0.324074074074074, 0.12962962962963, 0.314814814814815, 
    0.203703703703704, 0.324074074074074, 0.148148148148148, 
    0.314814814814815, 0.12962962962963, 0.324074074074074, 0.148148148148148, 
    0.314814814814815, 0.12962962962963, 0.324074074074074, 0.314814814814815, 
    0.12962962962963, 0.203703703703704, 0.00925925925925926, 
    0.324074074074074, 0.314814814814815, 0.203703703703704)), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -145L), groups = structure(list(
    Treatment = structure(1:3, levels = c("Control", "Reef", 
    "Reef Adjacent"), class = "factor"), .rows = structure(list(
        c(7L, 8L, 9L, 17L, 24L, 25L, 32L, 39L, 40L, 46L, 47L, 
        48L, 54L, 55L, 62L, 63L, 68L, 69L, 75L, 76L, 77L, 84L, 
        90L, 91L, 92L, 96L, 97L, 102L, 108L, 114L, 122L, 125L, 
        129L, 141L, 142L, 145L), c(1L, 2L, 3L, 10L, 11L, 12L, 
        13L, 18L, 19L, 20L, 21L, 26L, 27L, 28L, 29L, 33L, 34L, 
        35L, 41L, 42L, 49L, 50L, 51L, 56L, 57L, 58L, 59L, 64L, 
        65L, 66L, 70L, 71L, 72L, 78L, 79L, 80L, 85L, 86L, 87L, 
        93L, 98L, 99L, 103L, 104L, 105L, 109L, 110L, 111L, 115L, 
        116L, 118L, 119L, 123L, 126L, 130L, 131L, 134L, 135L, 
        138L, 143L), c(4L, 5L, 6L, 14L, 15L, 16L, 22L, 23L, 30L, 
        31L, 36L, 37L, 38L, 43L, 44L, 45L, 52L, 53L, 60L, 61L, 
        67L, 73L, 74L, 81L, 82L, 83L, 88L, 89L, 94L, 95L, 100L, 
        101L, 106L, 107L, 112L, 113L, 117L, 120L, 121L, 124L, 
        127L, 128L, 132L, 133L, 136L, 137L, 139L, 140L, 144L)), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))
  • This is a duplicate: try `position = position_dodge2(preserve = "single")`. – Rui Barradas Aug 11 '23 at 15:08
  • As mentioned in my question, I have already tried this and it did not work! – Jordan Woolfrey Aug 11 '23 at 18:00
  • You mention `position_dodge`, but the answer to the question I believe this duplicates uses `position_dodge2`. Check whether it solves your problem; if it does not, I will reopen. – Rui Barradas Aug 11 '23 at 19:12
  • In addition to making sure you use `position_dodge2`, you seem to have duplicate rows in the data you provided. Try getting rid of the duplicates with `ggplot(unique(data), aes(x =Species, fill=Treatment, y=pct))` or combine them in some other way. When I do that I get the desired output https://i.stack.imgur.com/nMEnq.png – MrFlick Aug 11 '23 at 20:19
  • I tried using position_dodge2 and unfortunately it is still not working for me. @MrFlick could you post the code that you used to create that plot? – Jordan Woolfrey Aug 14 '23 at 11:51
  • Rui's solution and duplicate closure is correct. The problem you're having is that there are duplicate rows in your data. See `filter(data, Species == "URCH")` for example. You either need to use `stat_sum(geom = "bar", ...)` or use `distinct(data)` depending on which is correct. See [here](https://i.stack.imgur.com/P5euS.png). – Ian Campbell Aug 14 '23 at 12:57
  • This is just a subset of my data. The full dataset has other columns, such as day of the year, so the rows I posted are not duplicates there. In the full data set each row is unique, and I am still not able to use position_dodge2. – Jordan Woolfrey Aug 14 '23 at 13:12
  • Well if the data posted in the question is not sufficient to replicate the exact problem, then there might be something else going on. The code that worked for me with the sample data is: `ggplot(unique(data), aes(x =Species, fill=Treatment, y=pct)) + geom_col(position=position_dodge2(0.8, preserve = "single"), width=0.8, color="black") + theme_bw() + scale_y_continuous(expand = c(0, 0), limits = c(0, 0.35)) + ylab("Proportion of transects") + scale_fill_grey(start = 0, end = .9)` – MrFlick Aug 14 '23 at 13:47
  • Unfortunately that code didn't work for me with the full dataset. Strange. Thanks for your suggestions!! – Jordan Woolfrey Aug 14 '23 at 17:50
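
A minimal sketch of the approach suggested in the comments above: check how many rows feed each Species/Treatment bar, drop exact duplicates, then dodge with `position_dodge2(preserve = "single")`. The data frame `toy` and the names `rows_per_bar`, `toy_unique`, and `p` are hypothetical and made up for illustration; they are not the asker's data. The likely reason the bars turn slim is that `position_dodge2` places every row side by side as its own bar, so several rows for one Species/Treatment pair split the slot between them. If rows in a full dataset differ only in columns that are not plotted (for example day of the year), `unique()` will not collapse them, and they would presumably need to be aggregated to one row per bar first.

```r
library(ggplot2)

# Hypothetical toy data: repeated rows per Species/Treatment pair, and one
# species (URCH) observed at only one treatment level.
toy <- data.frame(
  Species   = c("CUNN", "CUNN", "CUNN", "CUNN", "URCH", "URCH", "URCH"),
  Treatment = c("Control", "Reef", "Reef", "Reef Adjacent", "Reef", "Reef", "Reef"),
  pct       = c(0.30, 0.32, 0.32, 0.31, 0.17, 0.17, 0.17)
)

# Count the rows behind each bar; a count above 1 means that more than one
# bar would be drawn for that Species/Treatment pair.
rows_per_bar <- aggregate(
  n ~ Species + Treatment,
  data = transform(toy, n = 1L),
  FUN  = sum
)

# Keep a single copy of each fully identical row, as suggested in the comments.
toy_unique <- unique(toy)

# Same plot as in the question, but with position_dodge2(preserve = "single")
# so that a lone bar keeps the width of one bar in a full group.
p <- ggplot(toy_unique, aes(x = Species, fill = Treatment, y = pct)) +
  geom_col(
    position = position_dodge2(width = 0.8, preserve = "single"),
    width = 0.8, color = "black"
  ) +
  theme_bw() +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 0.35)) +
  ylab("Proportion of transects") +
  scale_fill_grey(start = 0, end = 0.9)

print(rows_per_bar)
print(p)
```

Inspecting `rows_per_bar` before plotting is a cheap way to confirm whether the one-row-per-bar assumption holds for a given dataset.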
