
I have a dataframe with 30 columns and I would like to create 30 (gg)plots based on these columns. When creating a plot through ggplot, you have to create a variable to which all the information of the plot is added.

Is there a way to create 30 such variables in a for loop (so that I don't have to create and store them all locally)?

In earlier code, I repeated the steps below 30 times:

a1 = ggplot(data = results_round_one,
            aes(results_round_one$`R-0,01`)) 
a1 = a1 + geom_histogram()
a1 = a1 + xlim(0.46, 0.55)
a1 = a1 + geom_vline(xintercept= mean(results_round_one$`R-0,01`),
                     col = 'blue')
a1 = a1 + geom_vline(xintercept = max(results_round_one$`R-0,01`),
                     col = 'red')
a1= a1 + labs(y = 'Frequency', 
              x= 'Validated accuracy', 
              title = 'Optimizer = RMSProp', 
              subtitle = 'Learning rate = 0.01')

However, since I only have to change the aes and the labels, I think I should be able to do this process in a for loop as well.

Emil

2 Answers


You could apply a histogram function:

getImage <- function(col){
  a1 = ggplot(data = results_round_one,
              aes(results_round_one[, col])) +
  geom_histogram() + 
  xlim(0.46, 0.55) +
  geom_vline(xintercept= mean(results_round_one[, col]),
                       col = 'blue') +
  labs(y = 'Frequency', 
                x= 'Validated accuracy', 
                title = 'Optimizer = RMSProp', 
                subtitle = 'Learning rate = 0.01')
  return(a1)
}

to a vector of columns iteratively. In this case, `col_30` is a vector of column names:

# e.g. col_30 = c("col1", "col2") etc.
for(col in col_30){
  # ggplot objects are not auto-printed inside a for loop, so print explicitly
  print(getImage(col))
}

This generates one plot per column.
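If the goal is to keep all the plots rather than just draw them, a named list avoids creating 30 separate variables. The following is a minimal sketch, not the answerer's code: `make_histogram` is a hypothetical helper, the small `results_round_one` data frame is a made-up stand-in for the real data, and using the column name as the title is an assumption. It uses the `.data` pronoun (available in ggplot2 3.0.0 and later) so that non-syntactic column names such as `R-0,01` work.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data; replace with the real data frame.
set.seed(1)
results_round_one <- data.frame(
  `R-0,01`  = runif(100, 0.46, 0.55),
  `R-0,001` = runif(100, 0.46, 0.55),
  check.names = FALSE  # keep the non-syntactic column names as they are
)

# Hypothetical helper: build one histogram for the column named `col`.
make_histogram <- function(data, col) {
  ggplot(data, aes(x = .data[[col]])) +
    geom_histogram(bins = 30) +
    xlim(0.46, 0.55) +
    geom_vline(xintercept = mean(data[[col]]), col = "blue") +
    geom_vline(xintercept = max(data[[col]]), col = "red") +
    labs(y = "Frequency", x = "Validated accuracy", title = col)
}

# One plot per column, stored in a list named after the columns.
cols <- names(results_round_one)
plots <- lapply(cols, make_histogram, data = results_round_one)
names(plots) <- cols
```

A single plot can then be drawn with `plots[["R-0,01"]]`, or all of them with `for (p in plots) print(p)`.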

Aleksandr
  • In your example the variable (`a1`) and the frequent reassignment isn’t necessary at all, and it’s actually a bit confusing. Furthermore, this use of `aes` will yield subtly suboptimal results. It’s better to use `aes_string` with the name of the column instead. – Konrad Rudolph Jul 22 '18 at 18:29
  • I would suggest to use `<-` instead of `=` and no need to have multiple equations if you can get away with one – see-king_of_knowledge Jul 22 '18 at 18:29
  • @see-king_of_knowledge There’s absolutely no reason to use `<-` over `=`. It’s purely a stylistic choice. That said, the code should be *consistent*; i.e. use one everywhere, no mixing (as done in this answer). – Konrad Rudolph Jul 22 '18 at 18:30
  • @KonradRudolph, yeap, I would love to hear a reasonable motivation why assignment operator is better than equation sign. – Aleksandr Jul 22 '18 at 18:32
  • @KonradRudolph, I agree that in this case it will not make a difference and besides being consistent it is practically indifferent. But the fine difference can be found [here](https://stackoverflow.com/a/1742550/5184851) – see-king_of_knowledge Jul 22 '18 at 18:41
  • @see-king_of_knowledge There's no relevant difference. The accepted answer is pure post hoc rationalising, and most other answers there are wrong. Here's a better discussion: https://www.reddit.com/r/rlanguage/comments/47c3z7/_/d0bw0w3 — this is an unfortunate weak point of the R community. – Konrad Rudolph Jul 22 '18 at 20:28

In the absence of example data, here is some code that loops through the iris columns, creating density plots:

library(ggplot2)
library(purrr)
library(dplyr)

df <- iris %>%
  select(Sepal.Length:Petal.Width)

df %>% 
  map2(names(df), ~ .x %>% 
         as.data.frame %>% 
         set_names(.y) %>% 
         ggplot(aes_string(.y)) + geom_density() + ggtitle(.y))

Using your code, something along the lines of:

results_round_one %>% 
  map2(names(results_round_one), ~ .x %>% 
         as.data.frame %>% 
         set_names(.y) %>% 
         # aes_string() cannot parse names such as `R-0,01`; use the .data pronoun
         ggplot(aes(.data[[.y]])) +
           geom_histogram() +
           xlim(0.46, 0.55) +
           geom_vline(xintercept = mean(.x), col = 'blue') +
           geom_vline(xintercept = max(.x), col = 'red') +
           labs(y = 'Frequency', 
               x= 'Validated accuracy',
               title = 'Optimizer = RMSProp', 
               subtitle = 'Learning rate = 0.01'))   
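Since the question mentions that the labels also change from plot to plot, one option is to look up each label by column name. The sketch below is an illustration under assumptions, not part of the original answer: the `subtitles` lookup vector and the stand-in `results_round_one` data are hypothetical, and the mapping from column name to learning rate is a guess. `purrr::imap()` passes each column and its name to the function and returns a list named after the columns.

```r
library(ggplot2)
library(purrr)

# Hypothetical stand-in for the asker's data; replace with the real data frame.
set.seed(1)
results_round_one <- data.frame(
  `R-0,01`  = runif(100, 0.46, 0.55),
  `R-0,001` = runif(100, 0.46, 0.55),
  check.names = FALSE
)

# Hypothetical lookup table: column name -> subtitle text.
subtitles <- c(
  "R-0,01"  = "Learning rate = 0.01",
  "R-0,001" = "Learning rate = 0.001"
)

# imap() calls the function with (column values, column name).
plots <- imap(results_round_one, function(values, name) {
  ggplot(data.frame(value = values), aes(x = value)) +
    geom_histogram(bins = 30) +
    xlim(0.46, 0.55) +
    geom_vline(xintercept = mean(values), col = "blue") +
    geom_vline(xintercept = max(values), col = "red") +
    labs(y = "Frequency",
         x = "Validated accuracy",
         title = "Optimizer = RMSProp",
         subtitle = subtitles[[name]])
})
```

Copying each column into a one-column data frame with a fixed name (`value`) sidesteps the awkward column names entirely; the per-column text comes only from the lookup.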
Vlad C.