I have a dataset with 3 columns, but I need a generic function that can be applied to a dataset with more columns:

library(ggplot2)
library(plyr)
library(dplyr)

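# Columns 2 and 6 of plyr's baseball data: year and g (games)
baseball[, c(2,6)]
# Total games per year
D <- ddply(baseball[, c(2,6)], .(year), summarise, 
      g = sum(g, na.rm = TRUE))
# Running mean of the yearly totals, rounded
D$gCumulate <- round(cummean(D$g))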

I want to get the same plot as this:

g <- ggplot(D, aes(D[,1]))
g <- g + geom_line(aes(y=D[,2]), color="black")
g <- g + geom_line(aes(y=D[,3]), color="red")

I don't understand why this version works:

plotSimpleFuntion <- function(data=D, type, color) {
  names <- colnames(data)
  colnames(data) <- letters[1:length(data)]
  g <- ggplot(data, aes(data[,1]))
  if(type[1] == "line"){g <- g + geom_line(aes(y=data[,2]), color=color[1])}
  if(type[1] == "bar"){g <- g + geom_bar(aes(y=data[,2]), stat = 'identity',fill=color[1])}
  if(type[1] == "point"){g <- g + geom_point(aes(y=data[,2]), color=color[1])}
  if(type[1] == "area"){g <- g + geom_area(aes(y=data[,2]), fill=color[1])}
  if(type[2] == "line"){g <- g + geom_line(aes(y=data[,3]), color=color[2])}
  if(type[2] == "bar"){g <- g + geom_bar(aes(y=data[,3]), stat = 'identity',fill=color[2])}
  if(type[2] == "point"){g <- g + geom_point(aes(y=data[,3]), color=color[2])}
  if(type[2] == "area"){g <- g + geom_area(aes(y=data[,3]), fill=color[2])}
  g + xlab(names[1])
}

plotSimpleFuntion(D, type = c("line", 'line'), color = c("black","red"))

But this version, which uses a loop, does not:

plotSimpleFuntion <- function(data=D, type, color) {
  names <- colnames(data)
  colnames(data) <- letters[1:length(data)]
  g <- ggplot(data, aes(data[,1]))
  for (i in 1:2) {
    if(type[i] == "line"){g <- g + geom_line(aes(y=data[,i+1]), color=color[i])}
    if(type[i] == "bar"){g <- g + geom_bar(aes(y=data[,i+1]), stat = 'identity',fill=color[i])}
    if(type[i] == "point"){g <- g + geom_point(aes(y=data[,i+1]), color=color[i])}
    if(type[i] == "area"){g <- g + geom_area(aes(y=data[,i+1]), fill=color[i])}
  }
  g + xlab(names[1])
}

plotSimpleFuntion(D, type = c("line", 'line'), color = c("black","red"))
Mâlo
  • Does this help: [ggplot does not work if it is inside a for loop although it works outside of it](https://stackoverflow.com/questions/15678261/ggplot-does-not-work-if-it-is-inside-a-for-loop-although-it-works-outside-of-it) – markus Sep 17 '20 at 09:50
  • Hi @Malo. Good question. Unfortunately I lack the deep understanding of the issue to give a convincing answer. However, as far as I understand it, it has to do with when the expression `data[,i+1]` gets evaluated and what the value of `i` is at that time. Put differently, `data[,i+1]` only gets evaluated when g is plotted, and the value used in the evaluation is the last value of `i`, i.e. i == 2. That's probably one of the reasons why the use of the dataset name is discouraged in ggplot2. One way to make your code work is to use the `.data` pronoun, i.e. replace `data[, i+1]` by `.data[[letters[i+1]]]` – stefan Sep 17 '20 at 09:59

1 Answer

I'm pretty sure the issue is that data[, i + 1] etc. gets evaluated when the ggplot object is printed, i.e., after the loop has finished. At that point, i has the value 2, so both layers use the third column.
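
The delayed evaluation can be reproduced on a small scale. The following is a minimal sketch, not part of the original answer: the toy data frame `df` and its column names are made up for illustration, and it assumes a ggplot2 version in which `aes()` supports `!!` injection (3.0.0 or later). The "fixed" variant injects the column name when the layer is created instead of leaving `i` to be looked up at print time.

```r
library(ggplot2)

# Toy data (hypothetical): x plus two value columns
df <- data.frame(x = 1:3, y1 = c(1, 2, 3), y2 = c(10, 20, 30))

# Lazy version: aes() stores the expression df[, i + 1] unevaluated,
# so i is only looked up when the plot is built
g_lazy <- ggplot(df, aes(x = x))
for (i in 1:2) {
  g_lazy <- g_lazy + geom_line(aes(y = df[, i + 1]))
}

# Fixed version: !! evaluates the column name immediately,
# so each layer remembers its own column
g_fixed <- ggplot(df, aes(x = x))
for (i in 1:2) {
  col <- names(df)[i + 1]
  g_fixed <- g_fixed + geom_line(aes(y = .data[[!!col]]))
}

# layer_data() builds the plot and returns the data of one layer
layer_data(g_lazy, 1)$y   # the y2 values: i is already 2 at build time
layer_data(g_fixed, 1)$y  # the y1 values
```

Wrapping the loop body in a function (for example via `lapply`) is another way to give each layer its own copy of the loop variable.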

Anyway, your approach does not use ggplot2 as intended. Here is an example of how it should be used. Some more work would be needed to implement exactly what you intend, but I don't think you should do that anyway.

library(reshape2)
plotSimpleFuntion <- function(data=D, type, color) {
  # Use the function argument, not the global D
  n <- names(data)
  # Long format: one row per (x, variable) pair
  data <- melt(data, id.vars = n[[1]])
  geom <- switch(type,
                 line = geom_line,
                 bar = geom_col,
                 point = geom_point,
                 area = geom_area)
  ggplot(data, aes_string(x = names(data)[[1]], y = "value", 
                          color = "variable", fill = "variable")) + 
    geom() +
    scale_color_manual(breaks = n[-1], values = color) +
    scale_fill_manual(breaks = n[-1], values = color)
}

plotSimpleFuntion(D, type = "line", color = c("black","red"))

resulting plot

This follows the grammar of graphics as intended, but it can't mix different geoms. To support that, you would need an if condition that dispatches to different code (e.g., without the melt). I'd advise against it.
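
If mixing geoms per column is required despite that caveat, one possible shape for it is sketched below. This is an assumption-laden illustration rather than the answer's method: `plot_mixed` is a hypothetical helper name, the toy data frame is made up, and the code relies on `!!` injection inside `aes()` (ggplot2 3.0.0 or later) so that each layer keeps its own column.

```r
library(ggplot2)

# Hypothetical helper: one layer per value column,
# each with its own geom and colour
plot_mixed <- function(data, type, color) {
  cols <- names(data)
  geoms <- list(line = geom_line, bar = geom_col,
                point = geom_point, area = geom_area)
  layers <- Map(function(col, ty, co) {
    # bars and areas are coloured via fill, lines and points via colour
    args <- if (ty %in% c("bar", "area")) list(fill = co) else list(colour = co)
    # !! injects the column name now instead of at print time
    do.call(geoms[[ty]], c(list(mapping = aes(y = .data[[!!col]])), args))
  }, cols[-1], type, color)
  xcol <- cols[1]
  ggplot(data, aes(x = .data[[!!xcol]])) +
    unname(layers) +
    labs(x = xcol, y = NULL)
}

# Usage with toy data (hypothetical)
toy <- data.frame(x = 1:3, y1 = c(1, 2, 3), y2 = c(10, 20, 30))
p <- plot_mixed(toy, type = c("line", "point"), color = c("black", "red"))
```

Because the colours are set as fixed parameters rather than mapped aesthetics, this variant produces no legend, which is one of the trade-offs of working outside the long-format approach.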

Roland