I have a data frame with the article names, the overall number of samples for each article, the number of responders to a drug, and the number of non-responders. Altogether there are 9 articles:

Articles <- c("Nadeem Riaz", "David Braun","immotion150", "IMVIGOR210", "Alexander Lozano",
             "Van Allen", "Alexandra Pender", "David Lui", "Jae Cho")
Samples_number <- c(49, 311, 247, 298, 47, 39, 82, 121, 16)
With_Benefit <- c(26, 89, 168, 131, 27, 13,17,65, 5)
No_Benefit <- c(13, 102, 79, 167, 20, 26, 65, 56, 11)

MyData <- data.frame(Articles, Samples_number, With_Benefit, No_Benefit)

I need to make a bar chart with the article names on the x axis and the overall number of samples on the y axis, and color each bar so that, for example, responders are blue and non-responders are red for each article.

I built a bar chart, but I don't know what to put in the fill aesthetic (here I used the No_Benefit column, but I know that's the wrong fill):

myplot <- ggplot(MyData, aes(x = Articles, y = Samples_number, fill= No_Benefit)) + theme_bw() + geom_col(position = "stack")
print(myplot)
Programming Noob

2 Answers

I believe the main problem is with the format of the dataframe; your data is in 'wide' format, but ggplot2 works a lot better with 'long' format. You can pivot your dataframe from 'wide' to 'long' with the pivot_longer() function from the tidyr package, e.g.

library(ggplot2)
library(tidyr)

Articles <- c("Nadeem Riaz", "David Braun","immotion150", "IMVIGOR210", "Alexander Lozano",
              "Van Allen", "Alexandra Pender", "David Lui", "Jae Cho")
Samples_number <- c(49, 311, 247, 298, 47, 39, 82, 121, 16)
With_Benefit <- c(26, 89, 168, 131, 27, 13,17,65, 5)
No_Benefit <- c(13, 102, 79, 167, 20, 26, 65, 56, 11)

MyData <- data.frame(Articles, Samples_number, With_Benefit, No_Benefit)

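# Reshape to long format: one row per article and response group;
# the counts end up in the default "value" column
MyData_long <- pivot_longer(MyData, -c(Articles, Samples_number), names_to = "response")

# Map y to the per-group counts so each bar is split into With_Benefit and No_Benefit
ggplot(MyData_long, aes(x = Articles, y = value, fill = response)) +
  theme_bw() +
  geom_col(position = "stack")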

Created on 2022-02-09 by the reprex package (v2.0.1)
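
The question also asks for specific colors (responders blue, non-responders red). A minimal sketch of one way to do that, assuming the long-format data from above and using scale_fill_manual() with a named vector; the column name "count" and the legend title "Response" are my own choices for illustration, not part of the original answer:

```r
library(ggplot2)
library(tidyr)

MyData <- data.frame(
  Articles = c("Nadeem Riaz", "David Braun", "immotion150", "IMVIGOR210",
               "Alexander Lozano", "Van Allen", "Alexandra Pender",
               "David Lui", "Jae Cho"),
  Samples_number = c(49, 311, 247, 298, 47, 39, 82, 121, 16),
  With_Benefit = c(26, 89, 168, 131, 27, 13, 17, 65, 5),
  No_Benefit = c(13, 102, 79, 167, 20, 26, 65, 56, 11)
)

# Long format: one row per article and response group, counts in "count"
# ("count" is a name chosen here for illustration)
MyData_long <- pivot_longer(
  MyData,
  cols = c(With_Benefit, No_Benefit),
  names_to = "response",
  values_to = "count"
)

# The named vector ties each response group to a fixed color
p <- ggplot(MyData_long, aes(x = Articles, y = count, fill = response)) +
  geom_col(position = "stack") +
  scale_fill_manual(
    values = c(With_Benefit = "blue", No_Benefit = "red"),
    name = "Response"
  ) +
  theme_bw()

print(p)
```

Naming the entries of values keeps the color assignment independent of the order in which the groups happen to be sorted.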

jared_mamrot
  • Thank you so much sir. – Programming Noob Feb 08 '22 at 23:01
  • You're welcome. Also, you might want to double check you're not counting the same patients twice, e.g. Eli Van Allen and David Braun have coauthored a number of papers, so check that "Braun" (n=311) doesn't contain the "Van Allen" (n=39) patients. – jared_mamrot Feb 08 '22 at 23:13

This type of problem is usually a data reshaping problem. Reshape to long format and then plot a bar graph with geom_col.

Articles <- c("Nadeem Riaz", "David Braun","immotion150", "IMVIGOR210", "Alexander Lozano",
              "Van Allen", "Alexandra Pender", "David Lui", "Jae Cho")
Samples_number <- c(49, 311, 247, 298, 47, 39, 82, 121, 16)
With_Benefit <- c(26, 89, 168, 131, 27, 13,17,65, 5)
No_Benefit <- c(13, 102, 79, 167, 20, 26, 65, 56, 11)

MyData <- data.frame(Articles, Samples_number, With_Benefit, No_Benefit)

library(ggplot2)

MyData |>
  tidyr::pivot_longer(c(With_Benefit, No_Benefit)) |>
  ggplot(aes(Articles, value, fill = name)) +
  geom_col(position = position_stack()) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

Created on 2022-02-08 by the reprex package (v2.0.1)
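
Because the bar height in this plot is With_Benefit + No_Benefit rather than Samples_number, it may be worth checking whether the two agree for every article. A small base R sketch of that check, using the data from the question; the helper columns Classified and Unclassified are names introduced here for illustration only:

```r
MyData <- data.frame(
  Articles = c("Nadeem Riaz", "David Braun", "immotion150", "IMVIGOR210",
               "Alexander Lozano", "Van Allen", "Alexandra Pender",
               "David Lui", "Jae Cho"),
  Samples_number = c(49, 311, 247, 298, 47, 39, 82, 121, 16),
  With_Benefit = c(26, 89, 168, 131, 27, 13, 17, 65, 5),
  No_Benefit = c(13, 102, 79, 167, 20, 26, 65, 56, 11)
)

# Samples that fall into one of the two response groups
MyData$Classified <- MyData$With_Benefit + MyData$No_Benefit

# Samples counted in the total but in neither group
MyData$Unclassified <- MyData$Samples_number - MyData$Classified

# Keep only the articles where the groups do not add up to the total
mismatch <- MyData[MyData$Unclassified != 0,
                   c("Articles", "Samples_number", "Classified", "Unclassified")]
print(mismatch)
```

With the numbers given in the question, "Nadeem Riaz" and "David Braun" are the two rows where the groups do not add up to the total (10 and 120 samples short), so their bars will be lower than Samples_number unless a third category is plotted for the remainder.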

Rui Barradas