I would like to reproduce the stacked histogram below (from Armstrong A., Microbiome, 2018).

[image: stacked histogram from Armstrong A., Microbiome, 2018]

The plot itself is no problem: I can order my relative abundances by PCoA coordinates. My issue is that I can't find a way to label the top of each column/stack with clinical data (in this example, sexual orientation and HIV status). Is this done with ggplot or something else? How do I plot the rows above the bars? Thank you!

[Edit] Some data to try with:

Histogram (already ordered by PCoA)

Pt  Streptococcus   Staphylococcus  Lactobacillus   Acinetobacter   Pseudomonas Bacillus
patient1    61.20    5.65    7.45    1.65    0.30    0.60
patient6    43.00    2.10   18.10    0.40    0.60    0.60
patient5    41.95    4.10   24.55    0.75    0.90    0.00
patient8    41.15   25.95    3.50    0.20    7.45    0.30
patient4    26.45   55.10    2.55    3.40    0.05    2.85
patient7    18.20   26.40    0.95   20.25    0.50    0.05
patient3    18.00   18.70   38.55    0.10   56.55    0.00
patient2     0.35    0.05    2.10    0.20    0.40   94.75

Metadata to label on top

Pt Time
patient1    T1
patient2    T3
patient3    T4
patient4    T2
patient5    T2
patient6    T1
patient7    T1
patient8    T2
camcecc10
  • Could you please [prepare an example dataset](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so people can answer your question with some data under their hands? – utubun Sep 07 '21 at 09:16
  • You're totally right, sorry. I've added some, I hope it's enough (since I don't know how to do what I'm asking for...) – camcecc10 Sep 07 '21 at 09:33

1 Answer

OK, this is a fairly quick answer. I had some trouble preparing your data, so ignore the preparation code up front, which is a little messy. Essentially, I would recommend the excellent patchwork library for creating combined plots: the syntax is very simple, yet you can control the output in a very fine-grained manner.

Load packages:

library(tidyverse)
library(patchwork)

Read data:

df <- read.csv2(text = "cPt,  Streptococcus,   Staphylococcus,  Lactobacillus,   Acinetobacter,   Pseudomonas, Bacillus,
patient1,    61.20,    5.65,    7.45,    1.65,    0.30,    0.60,
patient6,    43.00,    2.10,   18.10,    0.40,    0.60,    0.60,
patient5,    41.95,    4.10,   24.55,    0.75,    0.90,    0.00,
patient8,    41.15,   25.95,    3.50,    0.20,    7.45,    0.30,
patient4,    26.45,   55.10,    2.55,    3.40,    0.05,    2.85,
patient7,    18.20,   26.40,    0.95,   20.25,    0.50,    0.05,
patient3,    18.00,   18.70,   38.55,    0.10,   5.655,    0.00,
patient2,     0.35,    0.05,    2.10,    0.20,    0.40,   94.75", 
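header = TRUE, 
sep = ",",  # comma-separated fields; read.csv2() still assumes a decimal comma, so the numbers arrive as character
stringsAsFactors = FALSE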
)

df2 <- read.csv2(text = "cPt, Time
patient1,    T1
patient2,    T3
patient3,    T4
patient4,    T2
patient5,    T2
patient6,    T1
patient7,    T1
patient8,    T2", 
header = TRUE, sep = ",", strip.white = TRUE
)

Wrangle data - calculate other bacteria:

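df %>% 
    select(-X) %>%  # drop the empty column created by the trailing commas
    # the values were read as character, so convert them before subtracting;
    # 'other' is turned back into character so pivot_longer() can stack it with the rest
    mutate(other = as.character(100 - as.numeric(Streptococcus) - 
            as.numeric(Staphylococcus) - 
            as.numeric(Lactobacillus) - 
            as.numeric(Acinetobacter) - 
            as.numeric(Pseudomonas) - 
            as.numeric(Bacillus)))  %>%
    pivot_longer(-cPt) %>%  # long format: one row per patient and genus
    left_join(df2, by = "cPt") -> df.complete  # attach the Time label to every row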
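
The `as.numeric()` / `as.character()` round trips above are only needed because `read.csv2()` assumes a decimal comma. As a minimal sketch of a tidier preparation, assuming you can drop the trailing commas from the text and use `read.csv()` instead (the names `abund`, `meta` and `abund_long` are my own, purely illustrative), something like this should give numeric columns directly:

```r
library(dplyr)
library(tidyr)

# Abundances copied from the answer's data, trailing commas removed.
# read.csv() expects a decimal point, so the columns are read as numeric.
abund <- read.csv(text = "cPt,Streptococcus,Staphylococcus,Lactobacillus,Acinetobacter,Pseudomonas,Bacillus
patient1,61.20,5.65,7.45,1.65,0.30,0.60
patient6,43.00,2.10,18.10,0.40,0.60,0.60
patient5,41.95,4.10,24.55,0.75,0.90,0.00
patient8,41.15,25.95,3.50,0.20,7.45,0.30
patient4,26.45,55.10,2.55,3.40,0.05,2.85
patient7,18.20,26.40,0.95,20.25,0.50,0.05
patient3,18.00,18.70,38.55,0.10,5.655,0.00
patient2,0.35,0.05,2.10,0.20,0.40,94.75")

# Metadata: one Time label per patient
meta <- read.csv(text = "cPt,Time
patient1,T1
patient2,T3
patient3,T4
patient4,T2
patient5,T2
patient6,T1
patient7,T1
patient8,T2")

abund_long <- abund %>%
  # remainder up to 100 %, computed over all columns except the patient ID
  mutate(other = 100 - rowSums(across(-cPt))) %>%
  # long format: one row per patient and genus (columns 'name' and 'value')
  pivot_longer(-cPt) %>%
  # attach the Time label to every row
  left_join(meta, by = "cPt")
```

With numeric values, `y = value` can be used in `aes()` without any conversion.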

Create bottom plot:

df.complete %>%
    ggplot(aes(x = cPt, y = as.numeric(value), fill = name)) + 
    geom_col(width = 1) + 
    theme_minimal() + 
    theme(legend.position = "bottom")  + 
    scale_fill_brewer(palette = "Set3") -> plot1

Create top plot:

df.complete %>%
    ggplot(aes(x = cPt, fill = Time)) + 
    geom_bar(width = 1) + 
    theme_void() + 
    theme(legend.position = "top") + 
    scale_fill_brewer(palette = "Set1") -> plot2

Combine plots with patchwork:

plot2 + plot1 + plot_layout(ncol = 1, heights = c(1, 10))

[image: the resulting combined plot]
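
One caveat: ggplot2 sorts a character x axis alphabetically, so the PCoA order of the rows is not kept automatically. A minimal sketch of one way to keep it, assuming the row order of the abundance table is the order you want: turn the patient ID into a factor whose levels follow that order, in both data frames. The sketch uses only three patients from the data above to stay short, draws the label row with `geom_tile()` instead of `geom_bar()` (my substitution, one tile per patient), and the object names are illustrative:

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(patchwork)

# Three patients from the question's data, kept in their PCoA order
abund <- data.frame(
  cPt            = c("patient1", "patient6", "patient2"),
  Streptococcus  = c(61.20, 43.00, 0.35),
  Staphylococcus = c(5.65, 2.10, 0.05),
  Lactobacillus  = c(7.45, 18.10, 2.10),
  Acinetobacter  = c(1.65, 0.40, 0.20),
  Pseudomonas    = c(0.30, 0.60, 0.40),
  Bacillus       = c(0.60, 0.60, 94.75)
)
meta <- data.frame(
  cPt  = c("patient1", "patient2", "patient6"),
  Time = c("T1", "T3", "T1")
)

# Row order of 'abund' = desired left-to-right order of the columns
pt_order <- abund$cPt

long <- abund %>%
  pivot_longer(-cPt) %>%
  mutate(cPt = factor(cPt, levels = pt_order))
meta <- meta %>%
  mutate(cPt = factor(cPt, levels = pt_order))

# Bottom panel: stacked relative abundances
bars <- ggplot(long, aes(x = cPt, y = value, fill = name)) +
  geom_col(width = 1) +
  scale_fill_brewer(palette = "Set3") +
  theme_minimal() +
  theme(legend.position = "bottom")

# Top panel: one coloured tile per patient
strip <- ggplot(meta, aes(x = cPt, y = 1, fill = Time)) +
  geom_tile() +
  scale_fill_brewer(palette = "Set1") +
  theme_void() +
  theme(legend.position = "top")

# Stack the two panels; the shared factor levels keep the columns lined up
combined <- strip / bars + plot_layout(heights = c(1, 10))
combined
```

Because both panels use the same factor levels on x, each tile should sit directly above its stack.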

utubun
CMichael