First, load these packages:

library(ggplot2)
library(ggrepel)
library(dplyr)  # provides %>%, as_tibble() and mutate() used in the reproducible example below

I have a dataframe "dframe" like this:

V1          V2           V3          V4          V5          V6          V7          Groups
0.05579838 -0.44781204 -0.164612982 -0.05362210 -0.23103516 -0.04923499 -0.06634579      1
0.14097924 -0.35582736  0.385218841  0.18004788 -0.18429327  0.29398646  0.69460669      2
0.10699838 -0.38486299 -0.107284020  0.16468591  0.48678593 -0.70776085  0.20721932      3
0.22720072 -0.30860464 -0.197930310 -0.24322096 -0.30969028 -0.04460600 -0.08420536      4
0.24872635 -0.23415141  0.410406163  0.07072146 -0.09302970  0.01662256 -0.21683816      5
0.24023670 -0.27807097 -0.096301697 -0.02373198  0.28474825  0.27397862 -0.29397324      6
0.30358363  0.05630646 -0.115190308 -0.51532428 -0.08516130 -0.08785924  0.12178198      7
0.28680386  0.07609196  0.488432908 -0.13082951  0.00439161 -0.17572986 -0.25854047      8
0.30084361  0.06323714 -0.008347161 -0.26762137  0.40739524  0.22959024  0.19107494      9
0.27955675  0.22533959 -0.095640072 -0.27988676 -0.04921808 -0.10662521  0.19934074     10
0.25209125  0.22723231  0.408770841  0.13013867 -0.03850484 -0.23924023 -0.16744745     11
0.29377963  0.13650247 -0.105720288 -0.00316556  0.29653723  0.25568169  0.06087608     12
0.24561895  0.28729625 -0.167402464  0.24251060 -0.22199262 -0.17182828  0.16363196     13
0.25150342  0.25298115 -0.147945172  0.43827820  0.02938933  0.01778563  0.15241257     14
0.30902922 -0.01299330 -0.261085058  0.13509982 -0.40967529 -0.11366113 -0.06020937     15
0.28696274 -0.12896680 -0.196764195  0.39259942  0.08362863  0.25464125 -0.29386260     16

Here is a reproducible dataframe that you can use from Mark Peterson:

dframe <-
  rnorm(70) %>%
  matrix(nrow = 10) %>%
  as_tibble() %>%
  setNames(paste0("V", 1:ncol(.))) %>%
  mutate(Groups = 1:nrow(.)
         , Label = 1:nrow(.))

I created a table of combinations of columns I want to be used from my dataframe:

#Create all possible combinations
combs<-expand.grid(seq(7),seq(7))
#Remove duplicate and order
combs<-combs[combs$Var1 != combs$Var2,]
combs<-combs[order(combs[,1]),]

Then I wrote a for loop that is supposed to generate a list of ggplots, one plot per combination:

list_EVplots<-list()
  for(i in seq(nrow(combs))){
    list_EVplots[[paste(combs[i,1],"&",combs[i,2])]]<- ggplot(data=dframe) +
      ggtitle(paste("Eigenvector Plot - Pairwise",
                    "correlation with","adjustment")) +
      geom_point(aes(x = dframe[,combs[i,1]], y = dframe[,combs[i,2]],
                     color = Groups)) +
      geom_segment(aes(x = rep(0,nrow(dframe)), y = rep(0,nrow(dframe)),
                       xend = dframe[,combs[i,1]], yend = dframe[,combs[i,2]],
                       color = Groups),
                   size = 1, arrow = arrow(length = unit(0.3,"cm"))) +
      geom_label_repel(aes(x = dframe[,combs[i,1]], y = dframe[,combs[i,2]],
                           label = rownames(dframe))) +
      scale_color_manual(values=colors) +
      xlab(paste0("Eigenvector ",combs[i,1])) +
      ylab(paste0("Eigenvector ",combs[i,2])) +
      theme(plot.title = element_text(hjust = 0.5),
            axis.title = element_text(size = 13),
            legend.text = element_text(size=12)) +
      geom_hline(yintercept = 0, linetype="dashed") +
      geom_vline(xintercept = 0, linetype="dashed")
  }

After running this for loop, I obtain my list "list_EVplots". Problem: the iteration seems to work for xlab() and ylab(), and it also works for the names of the plots in the list, but the coordinates in geom_point(aes()) and geom_segment(aes()) do not change. The coordinates stay the same when they obviously should change! I think they stay locked on the ones used for the first plot of the first iteration. If anyone has a solution for that, I would be very grateful for your help.

Working under Linux 16.04 with RStudio. R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"

I tried with a subsetted dataframe containing only the columns I wanted to work with, instead of the 8-column dataframe: it didn't work.

Expected: The list should contain different plots: all plots should be different.

Problem: All plots have the same coordinates for dots and segments in the list.

Yoann Pageaud
  • Can you provide a [reproducible](https://stackoverflow.com/a/5963610/8918309) example so that people can test their answers before posting? And as it comes to your question, my first guess would be to use the `ggplotGrob` function, i.e. convert the output of `ggplot() + ...` to a `grob` before assigning it to `list_EVplots[[paste(combs[i,1],"&",combs[i,2])]]` – MRau Jan 31 '19 at 16:09
  • I thought it was reproducible. Can you tell me what you are missing ? I can edit accordingly then. – Yoann Pageaud Jan 31 '19 at 16:10
  • Ok, so sample data as output from `dput()` would be nice. And it is always helpful if you reduce your code as much as possible while still reproducing the problem. Now I'm getting the error `could not find function "geom_label_repel"`. After reducing your code to omit this missing function I get `Error: 'data' must be uniquely named but has duplicate columns`. – MRau Jan 31 '19 at 16:18
  • @MRau I edited the post. Hope it helps! – Yoann Pageaud Jan 31 '19 at 16:29

1 Answer

The simplest fix is often the best one: try to avoid for loops in places where lapply is more appropriate. I don't see anything obvious in your code that shows where the problem lies, but my guess is that it comes from the deeply nested [] statements.

Here is an approach using lapply and aes_string to handle the variables. If you want something other than a full pairwise set of plots, you may have to modify the calls to the two lapply's a bit.

First, some reproducible data (made using dplyr). Note that I made the Labels explicit instead of relying on the rownames (this is good practice, and far easier to use in calls to ggplot).

library(dplyr)  # for %>%, as_tibble() and mutate()

dframe <-
  rnorm(70) %>%
  matrix(nrow = 10) %>%
  as_tibble() %>%
  setNames(paste0("V", 1:ncol(.))) %>%
  mutate(Groups = 1:nrow(.)
         , Label = 1:nrow(.))

Then, I am pulling out the columns that you want to use for your plots. I am naming them so that the returned list has the column names automatically assigned.

my_cols <-
  names(dframe)[1:7] %>%
  setNames(.,.)
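
To illustrate why the vector is named with itself: lapply() carries the names of its input over to its result, so naming each column name after itself gives back a list that can be indexed by column name. The three column names below are placeholders standing in for names(dframe)[1:7].

```r
# Placeholder column names, standing in for names(dframe)[1:7]
cols <- c("V1", "V2", "V3")

# setNames(x, x) names each element after its own value
named_cols <- setNames(cols, cols)

# lapply() keeps the names of its input on the returned list
res <- lapply(named_cols, function(col) paste("Eigenvector", col))

names(res)   # the list is named by column
res[["V2"]]  # so a single result can be pulled out by column name
```

Without the setNames() step, the returned list would be unnamed and the plots could only be reached by position.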

Then, just set up a nested lapply to work through all of the pairwise comparisons:

plot_list <-
  lapply(my_cols, function(col1){
    lapply(my_cols, function(col2){

      if(col1 == col2){
        return(NULL)
      }

      ggplot(dframe) +
        ggtitle(paste("Eigenvector Plot - Pairwise",
                      "correlation with","adjustment")) +
        geom_point(aes_string(x = col1
                              , y = col2
                              , color = "Groups")) +
        geom_segment(aes_string(xend = col1
                                , yend = col2
                                , color = "Groups")
                     , x = 0
                     , y = 0
                     , size = 1
                     , arrow = arrow(length = unit(0.3,"cm"))) +
        geom_label_repel(aes_string(x = col1
                                    , y = col2
                                    , label = "Label")) +
        xlab(paste0("Eigenvector ", col1)) +
        ylab(paste0("Eigenvector ", col2)) +
        theme(plot.title = element_text(hjust = 0.5),
              axis.title = element_text(size = 13),
              legend.text = element_text(size=12)) +
        geom_hline(yintercept = 0, linetype="dashed") +
        geom_vline(xintercept = 0, linetype="dashed")

    })
  })

Note that you did not include the colors that you wanted to use for the groups, so I left the defaults instead.
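
If you do want to add your own colors back, here is a minimal sketch under a couple of assumptions: the palette below is invented for illustration, and Groups is converted to a factor because scale_color_manual() expects a discrete variable. I also named the vector group_colors rather than colors, since colors() is already a function in grDevices.

```r
library(ggplot2)

# Tiny made-up data set with the same column layout as dframe
df <- data.frame(V1 = c(0.1, -0.2, 0.3),
                 V2 = c(0.4, 0.1, -0.3),
                 Groups = 1:3)

# Hypothetical palette: one named color per group level
group_colors <- c("1" = "firebrick", "2" = "steelblue", "3" = "darkgreen")

p <- ggplot(df) +
  # factor() makes Groups discrete so that a manual scale can be applied
  geom_point(aes(x = V1, y = V2, color = factor(Groups))) +
  scale_color_manual(values = group_colors, name = "Groups")
```

Naming the palette by group level keeps the color assignment stable even if the rows are reordered.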

The plots come out correctly and this should be easier to work through.
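
One caveat: aes_string() has been soft-deprecated since ggplot2 3.0.0 in favour of tidy evaluation. Below is a rough sketch of the same nested lapply using the .data pronoun instead. The three-column data frame is a made-up stand-in, and I left out the labels, title and theme to keep it short, so treat it as an illustration rather than a drop-in replacement.

```r
library(ggplot2)

# Made-up stand-in for dframe (three columns instead of seven)
set.seed(1)
dframe <- data.frame(V1 = rnorm(10), V2 = rnorm(10), V3 = rnorm(10),
                     Groups = 1:10)

my_cols <- setNames(paste0("V", 1:3), paste0("V", 1:3))

plot_list <- lapply(my_cols, function(col1) {
  lapply(my_cols, function(col2) {
    # Skip the diagonal: a column plotted against itself
    if (col1 == col2) {
      return(NULL)
    }
    ggplot(dframe) +
      # .data[[name]] looks the column up by its name (a string)
      geom_point(aes(x = .data[[col1]], y = .data[[col2]],
                     color = Groups)) +
      geom_segment(aes(xend = .data[[col1]], yend = .data[[col2]],
                       color = Groups),
                   x = 0, y = 0,
                   arrow = arrow(length = unit(0.3, "cm"))) +
      xlab(paste0("Eigenvector ", col1)) +
      ylab(paste0("Eigenvector ", col2))
  })
})

# Individual plots are reached by column name; the diagonal is NULL
p12 <- plot_list$V1$V2
is.null(plot_list$V1$V1)
```

Because col1 and col2 are arguments of the anonymous functions, each plot keeps its own copies of them, which is what the for loop version was missing.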

Mark Peterson
  • "Oh, hi Mark!" - No, seriously: thank you so much for this fantastic answer. I adapted it to my code and it works perfectly now. I admit I am not a big fan of dplyr and would have preferred a solution with data.table, but I won't be picky: it does the job very well. So in your view the issue was coming from ggplot, right? Thanks again. – Yoann Pageaud Jan 31 '19 at 17:39
  • I think that the issue was caused by trying to pass data -- instead of column names -- to ggplot. That seems to often lead to trouble. – Mark Peterson Feb 01 '19 at 17:23
  • Yes, I agree, but I don't get why I only had issues with iterating over the coordinates used in the plot, and not over the axis labels. It's as if ggplot was only able to iterate over a part of itself. If that is really the case, the issue I encountered does not sound so trivial after all. – Yoann Pageaud Feb 02 '19 at 19:58
  • 1
    I played around a bit more, and it appears that it is using the coordinates for the *current* value of `i`. If you change the value of `i`, you will get the plot that matches that index. The ggplot object is not evaluating the data argument until it renders, at which point it is grabbing the current, global value of `i`. The same is apparently not true for the xlab and ylab functions. You can see some of this if you inspect the `ggplot` object more closely (e.g., with `str(list_EVplots[[1]])`) – Mark Peterson Feb 04 '19 at 13:53
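
The behaviour described in that last comment can be reproduced on a toy example. This is only a sketch with an invented two-column data frame: expressions inside aes() are stored and evaluated when the plot is built, whereas paste() inside xlab() is an ordinary function call that runs immediately.

```r
library(ggplot2)

# Invented two-column data frame
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# for loop: every stored aes() expression refers to the same variable i
loop_plots <- list()
for (i in 1:2) {
  loop_plots[[i]] <- ggplot(df) +
    geom_point(aes(x = df[, i], y = 0)) +
    xlab(paste("column", i))  # evaluated right away, so the label is correct
}

# i is 2 once the loop has finished, so the first plot now draws column b
x_from_loop <- layer_data(loop_plots[[1]])$x

# lapply: each call has its own environment holding its own col
fixed_plots <- lapply(setNames(names(df), names(df)), function(col) {
  force(col)  # make sure the argument is evaluated inside this call
  ggplot(df) +
    geom_point(aes(x = .data[[col]], y = 0)) +
    xlab(paste("column", col))
})

x_from_lapply <- layer_data(fixed_plots[["a"]])$x
```

Here x_from_loop holds the values of column b even though the plot was created when i was 1, while x_from_lapply holds the values of column a as intended.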