
I have designed an experiment to see how serum markers change with time since eating a meal. I have a data frame called BreakfastM, consisting of 72 observations and 230 variables.

Of these, 229 variables are serum markers and 1 is the timepoint. The observations are different samples.

I am looking for trends in how the serum markers (e.g. cholesterol) change with the timepoint. I have created a boxplot which nicely shows the trend in a particular serum marker in relation to timepoint.

This is the code I used

boxplot((BreakfastM$Variable~BreakfastM$Timepoint))

Is there a quick way to test all the variables in the dataframe against the timepoint by writing a loop code in R?

Frank
NLM09
  • 3
    It's a little unclear what you're asking for - are you looking for a loop that will create 229 boxplots? – MeetMrMet Nov 01 '16 at 18:53
  • 1
    Thank you for replying. There are nine time points and if I run the command above for the variable "cholesterol", I produce 1 graph with 9 box plots (one for each time point). I would like to produce 229 graphs, by substituting the variable, to gain a quick look at the data. Hope this makes sense. – NLM09 Nov 01 '16 at 21:13
  • In RStudio, the answer by Konrad Rudolph with lapply here: https://stackoverflow.com/questions/31993704/storing-ggplot-objects-in-a-list-from-within-loop-in-r worked for me, if I add a `print(myplots)` in the end. This will plot the plots one after another, so you need to navigate through them. In R-Markdown files, it's a bit easier to navigate. – dasWesen Apr 01 '22 at 10:35

2 Answers

5

If you are just looking to plot, converting to long form with tidyr (and dplyr) and then plotting with ggplot2 is probably the best starting point.

If you have only a small number of variables, you could just use facet_wrap to split the boxplots by measure. Because you didn't provide reproducible data, I am using the mtcars data, substituting "gear" for your time point, and limiting to just the numeric values to compare. select is picking the columns I want to use, then gather converts them to long format before passing to ggplot

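library(dplyr)
library(tidyr)
library(ggplot2)

mtcars %>%
  select(gear, mpg, disp:qsec) %>%   # keep the grouping column and the measures
  gather(Measure, Value, -gear) %>%  # long format: one row per measured value
  ggplot(aes(x = factor(gear)
             , y = Value)) +
  geom_boxplot() +
  facet_wrap(~Measure
             , scales = "free_y")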


Now, with 229 variables, that is not going to be a readable plot. Instead, you may want to look at facet_multiple from ggplus which spreads facets over multiple pages. Here, I am using it to put one per "page" which you can either view in the viewer, or save, depending on your needs.

First, save the base plot (with no facetting):

basePlot <-
  mtcars %>%
  select(gear, mpg, disp:qsec) %>%
  gather(Measure, Value, -gear) %>%
  ggplot(aes(x = factor(gear)
             , y = Value)) +
  geom_boxplot()

Then, use it as an argument to facet_multiple:

facet_multiple(basePlot, "Measure"
               , nrow = 1
               , ncol = 1
               , scales = "free_y")

This generates the same panels as above, but with one per page (increasing nrow and ncol shows more facets per page).
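If installing ggplus is not an option, a similar one-plot-per-page result can probably be had with ggplot2 alone: split the long data by Measure, build one plot per piece, and print each one to a multi-page PDF. This is only a sketch using the same mtcars stand-in; the names longData, plotList and outFile, and the file name boxplots_by_measure.pdf, are placeholders I made up.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same long-format data as above (mtcars standing in for the real data)
longData <-
  mtcars %>%
  select(gear, mpg, disp:qsec) %>%
  gather(Measure, Value, -gear)

# One ggplot object per measure; the list is named by measure
plotList <- lapply(split(longData, longData$Measure), function(d) {
  ggplot(d, aes(x = factor(gear), y = Value)) +
    geom_boxplot() +
    ggtitle(d$Measure[1])
})

# Placeholder output path; swap in any location you like
outFile <- file.path(tempdir(), "boxplots_by_measure.pdf")

pdf(outFile)                   # open a multi-page PDF device
for (p in plotList) print(p)   # each print() starts a new page
dev.off()
```

Keeping the plots in a list also means a single one can be pulled out by name, e.g. plotList[["mpg"]], without regenerating the rest.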

Mark Peterson
  • **Is there a quick way to test all the variables in the dataframe against the timepoint by writing a loop code in R?** where do you get "translate my code to dplyr and ggplot" from that? – rawr Nov 01 '16 at 19:30
  • @rawr : From "quick way" -- this is (or can be) substantially less code to make it flexible for any desired number of variables. It generates the desired output, but is certainly only one approach. If you think there is a quicker way without it, I'd be happy to see it. I don't think `dplyr` or `ggplot` are the right tool for everything, and they may be wrong here. However, this is a toy I've played with recently, and thought it might be useful. – Mark Peterson Nov 01 '16 at 19:37
  • okay but the "test all the variables ... against the timepoint" was the question, nothing about plotting – rawr Nov 01 '16 at 19:46
  • The only example of code shown is a boxplot, and at least one other commenter asked if all OP wanted was a set of boxplots. Note that the top of my answer says "If you are just looking to plot, ..." I took a bit of liberty to interpret what OP intended based on the lack of clarity in phrasing. If ze actually wants a test, ze will need to clarify (e.g., ANOVA vs lm), and my answer will not address that. If ze only wants a quick visual look (which is likely a better idea, given the multiple testing involved here), then this is a viable option. If I am wrong, I'm wrong, and will happily delete. – Mark Peterson Nov 01 '16 at 19:55
3

You can also use a loop to write many plots to image files in your working directory. Let's make a 10 column matrix representing 10 measured variables, each split by 3 factor levels:

data <- matrix(rnorm(150), nrow=15)
grps <- factor(c(rep("group1", 5), rep("group2", 5), rep("group3", 5)))

The loop writes each boxplot to files called var_1.png, var_2.png, etc. This will put 10 pngs in your working directory.

for (i in seq_len(ncol(data))) {
  png(filename = paste0("var_", i, ".png"))
  boxplot(data[, i] ~ grps)
  dev.off()
}

The files are very small and you can flick through them quickly with a simple image viewer.
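The same idea should carry over to a data frame with named columns, such as the one in the question. Below is a rough sketch that loops over every column except the time point and writes all boxplots to one multi-page PDF instead of separate pngs. The data are simulated: only the names BreakfastM, Timepoint and cholesterol come from the question, while glucose, insulin and the output file name are invented for illustration.

```r
# Simulated stand-in for the asker's data: 72 samples, 9 time points,
# and 3 marker columns (the real data frame has 229)
set.seed(1)
BreakfastM <- data.frame(
  Timepoint   = factor(rep(1:9, each = 8)),
  cholesterol = rnorm(72),
  glucose     = rnorm(72),   # hypothetical marker name
  insulin     = rnorm(72)    # hypothetical marker name
)

# Treat every column except the time point as a marker
markers <- setdiff(names(BreakfastM), "Timepoint")

# Placeholder output path; swap in any location you like
outFile <- file.path(tempdir(), "markers_by_timepoint.pdf")

pdf(outFile)   # each boxplot() call starts a new page
for (m in markers) {
  boxplot(BreakfastM[[m]] ~ BreakfastM$Timepoint,
          main = m, xlab = "Timepoint", ylab = m)
}
dev.off()
```

Putting the column name in the title makes it easier to tell the pages apart when flicking through a few hundred of them.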


Joe