I would really appreciate some help with this plot. I'm very new to R and, even after looking at many tutorials, I'm struggling to understand how to plot the following:

[Image: This is my Table] The x-axis is meant to show PatientID; the y-axis is the cell count for each patient.

I've managed to do a basic plot for each variable individually, e.g.:

[Image: This is for 2 of the variables]

And this gives me 2 separate graphs: [Image: Total cell counts] [Image: Cells counts for zone 1]

I would like all the data represented on one graph. That means each patient will have 4 bars: total cell counts, plus cell counts for each zone (1 to 3).

I don't understand whether I should be doing this as a combined plot, or making the 4 different plots and then combining them. I'm also very confused about how to actually code this. I've tried ggplot and I've used the regular `barplot()` in R (it worked for 1 variable at a time, but I'm not sure how to handle many variables). Some very step-by-step help would be much appreciated. TIA

Sk8erPhD
  • Please do not post code or data as images. When you do, we cannot copy/paste those into R to run and test the code. It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Sep 29 '21 at 22:56
  • There are many examples for making basic barplots, including the examples in `?barplot`. Did you try something like `barplot(t(tot_cells_table[, 2:3]), beside = TRUE)`? – rawr Sep 29 '21 at 23:09
  • @MrFlick - yes, good point. Thank you, will do! – Sk8erPhD Sep 30 '21 at 09:13
  • @rawr - Thanks for the idea! I made it work with this in mind - see my answer below. Thanks again! – Sk8erPhD Sep 30 '21 at 09:13
  • Please provide enough code so others can better understand or reproduce the problem. – Community Oct 02 '21 at 18:57

2 Answers

Here's a way of doing it using the ggplot2 and tidyr packages from the tidyverse. The key step is pivoting your data from "wide" to "long" format, so that the variable names end up in a single column that ggplot2 can map to the bar fill. After that, the ggplot call is pretty simple. There's more info here if you want a bit more explanation about stacked and grouped bar plots in ggplot2, with an example that's pretty much identical to yours.

library(ggplot2)
library(tidyr)

# Reproducing your data
dat <- tibble(
  patientID = c("a", "b", "c"),
  tot_cells = c(2773, 3348, 4023),
  tot_cells_zone1 = c(994, 1075, 1446),
  tot_cells_zone2 = c(1141, 1254, 1349),
  tot_cells_zone3 = c(961, 1075, 1426)
)

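# Pivot from wide to long: one row per patient/variable pair
to_plot <- pivot_longer(dat, cols = starts_with("tot"),
                        names_to = "Zone", values_to = "Count")

# position = "dodge" places each patient's bars side by side
ggplot(to_plot, aes(x = patientID, y = Count, fill = Zone)) +
  geom_bar(position = "dodge", stat = "identity")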

Output: [Image: grouped bar plot of OP's data]
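
If the wide-to-long step feels opaque, it can help to look at the reshaped table before plotting it. The snippet below is only a sketch that reuses the illustrative values from the code above (not the OP's real data) and prints the shape of the long table plus the rows for one patient:

```r
library(tidyr)

# Same illustrative values as in the answer above
dat <- tibble(
  patientID = c("a", "b", "c"),
  tot_cells = c(2773, 3348, 4023),
  tot_cells_zone1 = c(994, 1075, 1446),
  tot_cells_zone2 = c(1141, 1254, 1349),
  tot_cells_zone3 = c(961, 1075, 1426)
)

to_plot <- pivot_longer(dat, cols = starts_with("tot"),
                        names_to = "Zone", values_to = "Count")

# 3 patients x 4 count columns gives 12 rows in long format
dim(to_plot)

# All rows for one patient: these become that patient's four bars
to_plot[to_plot$patientID == "a", ]
```

Each former column name now appears as a value in `Zone`, which is what lets `fill = Zone` colour the bars and build the legend automatically.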

Rory S
  • Thank you for that code! Yes, I was trying to use ggplot2 as well but couldn't quite get the syntax. I've made it work now with `barplot()`, as in my answer below. However, I will also try the ggplot2 approach. Is there an advantage to using one over the other? Thanks again. – Sk8erPhD Sep 30 '21 at 09:15
  • No worries! I prefer ggplot2 for a few reasons; I think the syntax is much neater once you're used to it, it works well if you use other tidyverse packages (which I do, and would recommend), and there are several other packages that add extra functions that make producing complex/specialist graphs with ggplot quite easy. – Rory S Sep 30 '21 at 09:31
  • Awesome, thanks. Will def practice with ggplot. – Sk8erPhD Sep 30 '21 at 12:24
Thanks everyone for your help. I was able to make the plot as follows:

First, I made a new table from data I imported into R:

#Make new table of patientID and tot cell count
patientID <- c("a", "b", "c")
tot_cells <- c(tot_cells_a, tot_cells_b, tot_cells_c)
tot_cells_zone1 <- c(tot_cells_a_zone1, tot_cells_b_zone1, tot_cells_c_zone1)
tot_cells_zone2 <- c(tot_cells_a_zone2, tot_cells_b_zone2, tot_cells_c_zone2)
tot_cells_zone3 <- c(tot_cells_a_zone3, tot_cells_b_zone3, tot_cells_c_zone3)
tot_cells_table <- data.frame(tot_cells, 
                          tot_cells_zone1, 
                          tot_cells_zone2, 
                          tot_cells_zone3)
rownames(tot_cells_table) <- c(patientID)

Then I plotted it as follows, first converting the data.frame to a matrix:

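#Plot "Total Microglia Counts per Patient"

# data.matrix() keeps the patient IDs that were set as row names above;
# rownames.force expects TRUE/FALSE/NA, so it is left at its default
tot_cells_matrix <- data.matrix(tot_cells_table)

# Widen the right margin and allow drawing outside the plot region for the legend
par(mar = c(5, 4, 4, 10),
    xpd = TRUE)

# Transpose so that each patient becomes a group of 4 bars (one per variable)
barplot(t(tot_cells_matrix),
    col = c("red", "blue", "green", "magenta"),
    main = "Total Microglia Counts per Patient",
    xlab = "Patient ID", ylab = "Cell #",
    beside = TRUE)

legend("topright", inset = c(-0.4, 0),
   legend = c("tot_cells", "tot_cells_zone1",
              "tot_cells_zone2", "tot_cells_zone3"),
   fill = c("red", "blue", "green", "magenta"))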

And the graph looks like this: [Image: Barplot of multiple variables]
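
For anyone who wants to run this without the imported data, here is a self-contained sketch of the same base R approach. The counts are an assumption: they are copied from the reproduced data in the ggplot2 answer, not from my real import. It also lets `barplot()` build the legend from the matrix row names via `legend.text` and `args.legend`, rather than a separate `legend()` call:

```r
# Illustrative counts (copied from the ggplot2 answer), one row per variable
counts <- rbind(
  tot_cells       = c(2773, 3348, 4023),
  tot_cells_zone1 = c(994, 1075, 1446),
  tot_cells_zone2 = c(1141, 1254, 1349),
  tot_cells_zone3 = c(961, 1075, 1426)
)
colnames(counts) <- c("a", "b", "c")  # patient IDs label the bar groups

bar_cols <- c("red", "blue", "green", "magenta")

# Extra right margin so the legend can sit outside the plot region
par(mar = c(5, 4, 4, 10), xpd = TRUE)

# Rows of the matrix become the bars within each group, columns become the groups
bar_mids <- barplot(counts, beside = TRUE, col = bar_cols,
                    main = "Total Microglia Counts per Patient",
                    xlab = "Patient ID", ylab = "Cell #",
                    legend.text = TRUE,  # use the row names as legend labels
                    args.legend = list(x = "topright", inset = c(-0.4, 0)))
```

`barplot()` returns the x positions of the bar midpoints (here one row per variable and one column per patient), which is handy if you later want to add labels or error bars on top of the bars.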

Thanks again for pointing me in the right direction!

Sk8erPhD