I have a dataset of 8 columns and 152 rows. My aim is to create a barplot using ggplot2 of each column's mean and standard deviation (these vary quite a lot). I can create a scatter plot easily, but my barplot attempts produce multiple error messages, including:

Error in barplot.default(GRPA) : 'height' must be a vector or a matrix

Any suggestions or example code would be great.

Example of part of data:

structure(list(ALP.B = c(80L, 37L, 52L, 36L, 39L, 48L, 71L, 81L, 
77L, 38L, 56L, 33L, 64L, 70L, 43L, 45L, 59L, 42L, 59L, 45L), 
    ALT.B = c(13L, 15L, 10L, 13L, 18L, 8L, 12L, 13L, 18L, 13L, 
    10L, 28L, 10L, 13L, 12L, 28L, 15L, 7L, 11L, 13L), AST.B = c(14L, 
    16L, 13L, 13L, 12L, 13L, 18L, 16L, 19L, 14L, 15L, 21L, 15L, 
    13L, 12L, 16L, 23L, 12L, 14L, 12L), TBL.B = c(12.654, 6.498, 
    4.788, 6.84, 14.364, 6.156, 9.063, 10.773, 7.353, 7.182, 
    7.866, 8.721, 13.338, 7.866, 11.628, 10.089, 5.301, 9.918, 
    7.353, 7.182), ALP.M = c(87L, 37L, 55L, 35L, 37L, 50L, 74L, 
    89L, 83L, 36L, 58L, 32L, 78L, 78L, 43L, 51L, 60L, 47L, 50L, 
    51L), ALT.M = c(22L, 25L, 10L, 11L, 21L, 8L, 10L, 17L, 21L, 
    16L, 13L, 27L, 14L, 18L, 13L, 41L, 14L, 8L, 13L, 14L), AST.M = c(22L, 
    23L, 13L, 12L, 15L, 13L, 15L, 13L, 22L, 17L, 18L, 27L, 16L, 
    15L, 13L, 23L, 22L, 12L, 13L, 15L), TBL.M = c(23.085, 8.037, 
    6.498, 8.037, 16.758, 5.985, 7.524, 7.866, 8.379, 7.866, 
    8.208, 13.338, 15.732, 8.208, 14.706, 15.39, 7.866, 7.353, 
    9.918, 7.866)), row.names = c(NA, 20L), class = "data.frame")

My code is rudimentary, as I have tried so many variations:

ggplot(colMeans(GRPA), aes(x="drug", y="value")) + 
  geom_bar(stat = "identity")
stefan
  • Welcome to SO! To help us to help you could you please make your issue reproducible by sharing a sample of your **data**, the **code** you tried and the **packages** you used? See [how to make a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – stefan Jan 07 '21 at 13:43
  • have updated please let me know if alright – Please help Jan 07 '21 at 13:51
  • If you want to post data simply type `dput(NAME_OF_DATASET)` into the console and copy & paste the output starting with `structure(....` into your post. If your dataset has a lot of observations you could do `dput(head(NAME_OF_DATASET, 20))` for the first twenty rows of data. Additionally please post the **code** you have tried and which causes the issue. – stefan Jan 07 '21 at 13:54
  • sorry this really isn't working haven't got a clue – Please help Jan 07 '21 at 14:01
  • (: Everything fine. Now we have the data as nice dput(). That's great. How about the code? Simply copy the code and add it to your post (and best: format it as code).(; – stefan Jan 07 '21 at 14:04

1 Answer

There are several issues with your code. First, ggplot2 works on data frames, while you pass it a vector, `colMeans(GRPA)`. Additionally, if you want to pass variable names to ggplot2 inside aes(), do so without quotes.
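
Both points can be checked quickly in the console. The following is only a minimal sketch: `toy` and `means_df` are hypothetical names, and `toy` is a small stand-in for GRPA built from the first three values of two columns in the question.

```r
# Hypothetical stand-in for GRPA (first three values of two columns from the question)
toy <- data.frame(ALP.B = c(80, 37, 52), ALT.B = c(13, 15, 10))

m <- colMeans(toy)
is.data.frame(m)   # FALSE: colMeans() returns a named numeric vector
is.numeric(m)      # TRUE

# One way to turn the means into a data frame that ggplot() accepts
means_df <- data.frame(drug = names(m), mean = unname(m))
is.data.frame(means_df)   # TRUE

# With this data frame the mapping would use bare column names:
#   aes(x = drug, y = mean)   rather than   aes(x = "drug", y = "mean")
```

Quoted strings inside aes() are treated as constants, so every bar would be mapped to the single literal value "drug" instead of to the column of that name.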

To achieve your desired result, it's best to reshape your dataset into long (tidy) format, e.g. with tidyr::pivot_longer(). Afterwards you can use dplyr to compute the mean and standard deviation per drug. The summarised dataset can then be plotted with ggplot2:

library(dplyr)
library(tidyr)
library(ggplot2)

# Reshape dataset to long format, compute means per drug using group_by + summarise
GRPA_long <- GRPA %>% 
  pivot_longer(everything(), names_to = "drug", values_to = "value") %>% 
  group_by(drug) %>% 
  summarise(mean = mean(value), sd = sd(value))
#> `summarise()` ungrouping output (override with `.groups` argument)

ggplot(GRPA_long, aes(x = drug, y = mean)) + 
  geom_bar(stat = "identity") + 
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd))
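
As a side note, the error quoted in the question comes from base R's barplot(), which was given the whole data frame rather than a vector or matrix. If ggplot2 is not a hard requirement, a base R version could look roughly like the sketch below. It assumes a hypothetical stand-in `toy` (the first five rows of two columns from the question) in place of GRPA, and draws the error bars with the common arrows() trick.

```r
# Hypothetical stand-in for GRPA (first five rows of two columns from the question)
toy <- data.frame(
  ALP.B = c(80, 37, 52, 36, 39),
  ALT.B = c(13, 15, 10, 13, 18)
)

means <- colMeans(toy)     # named numeric vector, which barplot() accepts
sds   <- sapply(toy, sd)   # standard deviation of each column

# barplot() returns the x positions of the bar midpoints
mids <- barplot(means, ylim = c(0, max(means + sds) * 1.1), ylab = "mean")

# Draw mean +/- sd as vertical error bars with flat caps
arrows(x0 = mids, y0 = means - sds, x1 = mids, y1 = means + sds,
       angle = 90, code = 3, length = 0.05)
```

Replacing `toy` with GRPA should work the same way as long as every column is numeric.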

stefan