Seemingly equivalent code produces different results once it is wrapped in a for-loop.

This is header code common to both alternatives:

library(ggplot2)
library(gridExtra)

# factor common plot format elements
o2b <- colorRampPalette(c("brown", "orange"))(4)
textsize=8
pt <- theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank(), 
           panel.background=element_blank(), panel.border=element_blank(), 
           plot.title=element_blank(), plot.margin = unit(c(5.5,12,5.5,5.5), "pt"), 
           legend.background=element_blank(), legend.key=element_blank(), legend.position=c(1,1), 
           legend.justification=c(1,1), legend.text=element_text(size=textsize), legend.title=element_text(size=textsize), 
           axis.line=element_line(colour="black"), axis.text=element_text(size=textsize, colour="black"), 
           axis.title=element_text(size=textsize))

1) This is the code without a for-loop:

# fill list of plots of 4 scatterplot rank-size distributions
p <- list()
# each plot differs by setting x values for all scatterplots to the value of x in one of the scatterplots, by turn
p[[1]] <-ggplot(temp1, aes(x=rep(rank[naics_level==2], 4), y=rhh, colour=factor(naics_level))) + pt + 
  geom_point(shape=1, size=1) + scale_color_manual(values=o2b) + 
  guides(colour = guide_legend(title="Niveau de NAICS", title.position = "left", reverse=T)) +  
  labs(x="Rang des MSA", y="Diversité sectorielle basée sur l'emploi en 2015")
p[[2]] <-ggplot(temp1, aes(x=rep(rank[naics_level==3], 4), y=rhh, colour=factor(naics_level))) + pt + 
  geom_point(shape=1, size=1) + scale_color_manual(values=o2b) + 
  guides(colour = guide_legend(title="Niveau de NAICS", title.position = "left", reverse=T)) +  
  labs(x="Rang des MSA", y="Diversité sectorielle basée sur l'emploi en 2015")
p[[3]] <-ggplot(temp1, aes(x=rep(rank[naics_level==4], 4), y=rhh, colour=factor(naics_level))) + pt + 
  geom_point(shape=1, size=1) + scale_color_manual(values=o2b) + 
  guides(colour = guide_legend(title="Niveau de NAICS", title.position = "left", reverse=T)) +  
  labs(x="Rang des MSA", y="Diversité sectorielle basée sur l'emploi en 2015")
p[[4]] <-ggplot(temp1, aes(x=rep(rank[naics_level==5], 4), y=rhh, colour=factor(naics_level))) + pt + 
  geom_point(shape=1, size=1) + scale_color_manual(values=o2b) + 
  guides(colour = guide_legend(title="Niveau de NAICS", title.position = "left", reverse=T)) +  
  labs(x="Rang des MSA", y="Diversité sectorielle basée sur l'emploi en 2015")
grid.arrange(grobs=p, nrow=2)

which results in:

(image: grid of four distinct scatterplots)

2) This is the same code in a for-loop:

p <- list()
for (n in 1:4) {
  p[[n]] <- ggplot(temp1, aes(x=rep(rank[naics_level==n+1], 4), y=rhh, colour=factor(naics_level))) + pt + 
    geom_point(shape=1, size=1) + scale_color_manual(values=o2b) + 
    guides(colour = guide_legend(title="Niveau de NAICS", title.position = "left", reverse=T)) +  
    labs(x="Rang des MSA", y="Diversité sectorielle basée sur l'emploi en 2015")
}
grid.arrange(grobs=p, nrow=2)

which results in:

(image: grid of four identical scatterplots)

This time, all plots are duplicates of the 4th plot obtained by the first method. Where am I going wrong?

syre
  • Can you `dput` the data so we have a reproducible example, please? – Robin Gertenbach Sep 29 '17 at 09:17
  • Try what `for (n in 1:4) print(rep(rank[naics_level==n+1], 4))` gives you. Is this what you expect it to give? – Roman Luštrik Sep 29 '17 at 09:51
  • @RobinGertenbach I'm trying but pastebin is down right now. Is there an alternative? – syre Sep 29 '17 at 10:18
  • maybe a subset of the data that you can post directly here – Robin Gertenbach Sep 29 '17 at 10:23
  • Have you tried putting `x=rep(rank[naics_level==n+1], 4)` between `{` and `p[[n]]` and using `aes(x=x)`? And what does the list element `p[[n]]` look like after the loop? – Olivia Sep 29 '17 at 10:23
  • @RomanLuštrik Yes. There are 4 rank-size distributions. What I want to do is visually compare the rankings by plotting all 4 distributions according to 1 out of 4 rankings, by turn. This code tells R to duplicate one of the rankings for all 4 distributions. Since the underlying data is sorted by naics_level and entity id, the ranks will be applied to the correct entities at each naics_level. – syre Sep 29 '17 at 10:24
  • @Olivia I followed your advice and did this `for (n in 1:4) { temp <- temp1; temp$rank <- with(temp1, rep(rank[naics_level==n+1], 4)); p[[n]] <- ggplot(temp, aes(x=rank, y=rhh, colour=factor(naics_level))) + ...` which works like a charm! – syre Sep 29 '17 at 10:35
  • Within the loop, try converting each ggplot object into a grob & save *that* to your list instead. I've answered a similar question recently [here](https://stackoverflow.com/questions/46417470/saving-ggplot-in-a-list-gives-me-the-same-graph/46430284#46430284). – Z.Lin Sep 29 '17 at 10:37
  • [This answer](https://stackoverflow.com/a/26246791/2461552) gives a nice explanation of ggplot2 and lazy evaluation. It looks like you can use `aes_` in place of `aes` to force evaluation as shown [here](https://stackoverflow.com/a/44317976/2461552) – aosmith Sep 29 '17 at 18:43
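
The comments above point to lazy evaluation: `aes()` stores the expression `rep(rank[naics_level==n+1], 4)` unevaluated, so `n` is only looked up when the plots are drawn, by which time the loop has finished and `n` is 4. A minimal sketch of the workaround syre reports, computing the x values into a column inside the loop, follows. The data frame `toy` and the column `x_rank` are hypothetical stand-ins (the original `temp1` was never posted), and the theme and legend settings are left out for brevity.

```r
library(ggplot2)

# Hypothetical stand-in for temp1: 3 entities at each of 4 NAICS levels,
# sorted by naics_level and then by entity, as described in the question.
toy <- data.frame(
  naics_level = rep(2:5, each = 3),
  rank = c(1, 2, 3,  2, 1, 3,  3, 1, 2,  1, 3, 2),
  rhh  = c(0.9, 0.7, 0.5,  0.6, 0.8, 0.4,  0.3, 0.9, 0.6,  0.8, 0.2, 0.5)
)

p <- list()
for (n in 1:4) {
  tmp <- toy
  # Evaluate the x values now, while n still has its current value,
  # and store them as an ordinary column of the plot's own data frame.
  tmp$x_rank <- rep(toy$rank[toy$naics_level == n + 1], 4)
  p[[n]] <- ggplot(tmp, aes(x = x_rank, y = rhh, colour = factor(naics_level))) +
    geom_point(shape = 1, size = 1)
}

# Extract the x values each plot will actually draw
x_used <- lapply(p, function(g) layer_data(g, 1)$x)
```

Because each plot keeps its own copy of the data, nothing in the aesthetic mapping depends on `n` any more, and `x_used` can be inspected to confirm that the four plots differ before passing `p` to `grid.arrange(grobs=p, nrow=2)`.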
